CHAPTER 1


INTRODUCTION 


1.1 PURPOSE 


The purpose of this manual is to provide the information necessary to 
understand and use the Array Processor (AP) Math Library. The Math 
Library contains a versatile set of FORTRAN callable routines for use 
in high-speed array processing. Once these routines are installed in 
the host system, they can be called by standard FORTRAN programs. 


1.2 SCOPE 


This manual is a user document designed to describe the Math Library 
routines and acquaint the user with the unique features of the AP. The 
manual is divided into two parts. 


Part One consists of five chapters and four appendices. The five 
chapters provide general information about the AP and the use of the 
Math Library. Chapter 1 presents introductory material, including 
basic concepts about AP processing and Math Library use. Chapter 2 
provides general operating information necessary for the most efficient 
use of the Math Library routines. It includes information about memory 
organization, format conversion and speed considerations. It also 
defines a general procedure for program development. Chapter 3 defines 
the categories into which the Math Library routines are organized. 
Chapter 4 presents a number of detailed examples of array processing 
programs written with routines from the Math Library. Chapter 5 
describes the FORTRAN Math Library Simulator (MATHSIM) and its use. 


The four appendices are designed to provide quick and easy access to 
more detailed information about any one of the more than 150 Math 
Library routines. This information includes what each routine does, 
how to use it, and how fast it runs. Appendix A lists the Math Library 
routines alphabetically. Appendix B lists the routines by type and
page order. Appendix C gives an abbreviated summary of each routine
and defines its purpose and its calling parameters. Appendix D lists
the routines available for use in AP-FORTRAN program units and their
AP-FORTRAN calling names.


Part Two consists of four appendices. Appendix E provides complete 
reference material about each routine. Appendices F, G, and H are 
actually identical to Appendices A, B, and D, respectively; they are 
repeated in Part Two for easy reference. 


If more information is desired on the AP, the reader should refer to
the manuals listed in Table 1-1.


Table 1-1 Related Manuals 


MANUAL                                              PUBLICATION NO.

Processor Handbook                                  FPS 860-7259-003
Programmer's Reference Manual, Parts One and Two    FPS 860-7319-000
FORTRAN Reference Manual                            FPS 860-7408-000
APAL Reference Manual                               FPS 860-7412-000
APLOAD Reference Manual                             FPS 860-7410-000
APDBUG/APSIM Reference Manual                       FPS 860-7364-002
APEX Manual                                         FPS 860-7371-001
AP Diagnostic Software Manual                       FPS 860-7284-002




1.3 AP HARDWARE 


This section is included to give the user a general overall picture of 
the structure of the AP and some insights into why it can process 
arrays at such high speeds. It is not, however, necessary to know this 
information in order to write programs with the Math Library. 


1.3.1 BASIC ARCHITECTURE 


The AP uses a general-purpose, multi-bus oriented architecture. The 
floating adder and floating multiplier are each connected directly to 
each of the memory elements and registers in the AP through separate 
parallel 38-bit data paths. This parallel structure allows the 
overhead of array indexing, loop counting, and data fetching from 
memory to be performed simultaneously with the arithmetic operations on 
the data. Programs therefore execute much faster than on a typical
general-purpose computer, where each of these operations must occur
sequentially.


Specifically:


o  Programs, constants and data each reside in separate,
   independent memories to eliminate memory accessing
   conflicts.


o  Independent floating-point multiplier and adder
   units allow both arithmetic operations to be
   initiated every 167ns.


o  Two large blocks (32 locations each) of floating-point
   accumulators are available for temporary storage of
   intermediate results from the multiplier, adder or
   memory.


o  Address indexing and counting functions are performed
   by an independent integer arithmetic unit that includes
   16 integer accumulators.


In a typical application, such as a Fast Fourier Transform (FFT), the 
above features allow nearly the entire computation to be overlapped 
with data memory access time. 


Effective processing precision is enhanced by 38 bits of internal data 
width, an internal floating-point format with optimum numerical 
properties, and a convergent rounding algorithm. 


1.3.2 SYSTEM OVERVIEW 


The AP is connected to the host in a manner that permits data transfers 
to occur under control of either the host computer or the AP (refer to 
Figure 1-1). For most host computers, this means that the AP is 
interfaced to both the programmed I/O and DMA channels. 


Figure 1-1 AP Block Diagram


The system elements are interconnected with multiple parallel paths so
that transfers can occur in parallel. All internal floating-point data 
paths are 38 bits in width (10-bit biased binary exponent and 28-bit 
2’s complement mantissa). The main data memory (MD) is organized in 8K 
and 32K-word modules of 38-bit words each, expandable up to 512K words 
in the main chassis. Effective memory cycle times (interleaved) of 
either 167ns or 333ns are available. 


The table memory (TM) is used for storage of constants and is tied to a 
separate data path so as not to interfere with data memory. It is 
a bipolar, 167ns read-only memory, and is organized in 512-word, 38-bit
increments. An optional random access memory (TMRAM) is also 
available. 

The program source memory (PS) can hold from 512 to 4096 64-bit 
instruction words. 


Data pad X (DPX) and data pad Y (DPY) are two blocks of 32 floating 
accumulators each. Each is a two-port register block wherein one 
register may be read, and another written from each block in one 
instruction cycle.


The floating adder (FA) consists of two input registers, A1 and A2, and
a two-stage pipeline which performs the operations and convergently 
rounds the normalized result. 


The floating multiplier (FM) consists of two input registers, M1 and
M2, and a three-stage pipeline which performs the multiply operation. 
Products are normalized and convergently rounded 38-bit numbers. 


The s-pad consists of sixteen 16-bit integer registers and an integer
arithmetic unit which is used to form operand addresses and to perform
integer arithmetic.


1.3.3 EXAMPLE OF AP OPERATION


The following example shows the sequence the AP goes through to add two
vectors.


The initial conditions for this sequence are that the program to add 
two vectors resides in the AP program source memory and the two vectors 
to be added reside in the host memory. 


1. The host calls the AP executive program (APEX) to request 
host DMA cycles to transfer the two vectors from the host 
memory to the AP main data memory. The two vectors are 
converted from host floating-point format to the AP 
floating-point format on the fly as they pass through the 
formatting hardware of the interface. 


2. The host calls APEX to start the AP vector add routine. 
The routine is performed with the resultant vector 
remaining in the AP format. This format yields the 
benefit of 38-bit precision and convergent rounding during 
the critical phases of processing. 


3. The host calls APEX to request host DMA cycles to transfer 
the resultant vector back to the host memory. The vector 
is converted from AP format to host floating-point format, 
again on the fly. 


4. The AP proceeds to another process or stops executing, 
depending on previously established conditions. An 
interrupt to the host can be issued. 


A detailed discussion of this example is given in section 2.3. It is
given from a programming viewpoint and includes a commented FORTRAN
program.


1.3.4 FURTHER HARDWARE CONSIDERATIONS


The AP is most efficient when a sequence of operations can be performed 
on one or more vectors, or on a whole array which resides in the main 
data memory. This approach reduces data-transfer overhead and retains 
maximum numerical precision. A reasonable sequence, for example, would 
be to transfer a trace and a filter, FFT both, array multiply, inverse 
FFT, and transfer the result back to the host memory. 


The AP main data memory has DMA capability. This means that the 
interface can steal main data memory cycles from the AP microprocessor. 
This capability allows the host computer DMA-to-AP DMA data transfers 
to occur, thereby minimizing both host and AP overhead. 


The AP has been designed with enough built-in flexibility to allow its
power to be harnessed in a variety of ways. Refer to the AP Processor
Handbook (FPS 860-7259-003) for detailed descriptions of the elements
of the AP presented in this discussion.


1.4 AP SOFTWARE 


Four software packages are supplied with the AP to assist the user in 
running programs, writing programs, and diagnosing hardware faults. 


1.4.1 THE EXECUTIVE 


The AP executive (APEX) allows the user to communicate with the AP via 
FORTRAN or host assembly language calls. It is a subroutine linked 
into FORTRAN programs which use the array processor. The APEX driver 
subroutine interprets the particular user call and directs the AP to 
perform the specified action. Both the AP Math Library routines and
user-developed AP programs may be called from the host computer using 
APEX. 


1.4.2 THE AP MATH LIBRARY

The AP Math Library (APMATH) includes over 235 floating-point routines 
which cover a wide range of array processing needs. These routines, 
written in AP assembly language, can be called by programs written
in host FORTRAN, host assembly language, or AP assembly language.
This manual describes these routines in the following categories:


o  data transfer and control operations

o  basic vector arithmetic

o  vector-to-scalar operations

o  vector comparison operations

o  complex vector arithmetic

o  data formatting operations

o  matrix operations

o  FFT operations

o  auxiliary operations

o  APAL callable utility operations

o  signal processing operations

o  table memory operations


1.4.3 PROGRAM DEVELOPMENT PACKAGE


This package provides four FORTRAN IV programs which are compiled on 
the host computer during installation, and are for use in writing array 
processing programs and subroutines in the AP assembly language. The 
programs are as follows: 


APAL      The AP assembler (APAL) is a cross-assembler that
          provides a two-pass assembly of AP symbolic assembly
          language coding into an object module. APAL
          generates detailed error diagnostics.


APLOAD    APLOAD links and relocates separate APAL and
          AP-FORTRAN object modules together into a
          single load module.


APSIM     The AP simulator (APSIM) provides a programmed
          simulation of the various hardware elements of
          the AP. All timing characteristics of the AP
          are emulated, and the floating-point arithmetic
          is simulated (including rounding) to the least
          significant bit. APSIM is a convenient tool
          for bringing up new AP programs off-line without
          interfering with production runs.


APDBUG    APDBUG is an interactive debugging program
          with commands similar to those of APSIM. The user
          may selectively set breakpoints, examine and
          change memory and register contents, and run
          program segments.


The AP Programmer’s Reference Manual (FPS 860-7319-000) is a 
comprehensive instruction manual which describes developing programs 
using the AP Program Development Package. 


1.4.4 DIAGNOSTIC PACKAGE


The AP test programs are a collection of interactive diagnostic test 
and verify programs that aid in isolation of hardware faults. They 
include: 


APTEST    The AP test exercises the panel, DMA interface,
          and various internal registers and memories.
          It tests main data memory with simple patterns
          and then with random numbers. Board-level
          diagnostic indicators are provided.


APPATH    The AP path test checks the various internal
          data paths and gives board-level diagnostics.


APARTH    The AP arithmetic test checks the floating-point
          adder, multiplier, and s-pad arithmetic unit
          with pseudo-random number and operation
          sequences.


FIFFT     The forward/inverse FFT test verifies the correct
          operation of the AP as a complete unit by doing
          forward and inverse FFTs on both spikes
          and random number sequences.


CHAPTER 2


GENERAL OPERATION 


2.1 INTRODUCTION 


This section gives the basic information required to use the AP Math 
Library routines with host FORTRAN programs in order to process data 
with the AP. Miscellaneous information about the structure and 
operation of the AP is also included to help the user get the most 
efficient use of the AP. 


2.2 ARRAYS, VECTORS AND SCALARS 

The terms array and vector are used somewhat interchangeably when 
discussing array processing. There is, however, a difference between 
an array and a vector. 


An array is a group of numbers that are related to each other in some 
way. An array of numbers often has a multi-dimensional aspect to it. 
A matrix, for example, is an array. Another kind of an array is a 
table of numbers, such as a table of several parameters that are all
related to one system or measurement.


A vector in array processing terminology refers to a one-dimensional 
sequence (string) of numbers. The columns of a matrix or table are 
vectors. In this sense, a vector is essentially a subset of an array, 
i.e., a string of numbers that are all values for the same parameter.
When organizing an array for processing, the user usually divides the 
array into vectors and establishes one vector for each column of data. 
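
As a small illustration of dividing an array into column vectors, the following Python sketch extracts one vector per column of a table. It is not part of the Math Library, and the table values are invented for the example.

```python
# Each row is one measurement; each column is one parameter.
table = [
    [1.0, 10.0, 100.0],
    [2.0, 20.0, 200.0],
    [3.0, 30.0, 300.0],
]

# zip(*table) walks the table column by column, giving one vector
# (a one-dimensional sequence) per parameter.
vectors = [list(column) for column in zip(*table)]
```

Each entry of `vectors` is then a candidate for a separate transfer to the AP.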


Array processing often involves performing a relatively simple 
operation or algorithm repetitively on long sequences of data 
(vectors). The strength of the AP is that it is designed to perform 
such operations at much faster speeds than is possible with a general 
purpose processor. 


The individual numbers in an array or vector are called elements. A
vector of only one element is a scalar. Thus, a scalar refers to a 
single number. A vector operation may also involve a scalar (e.g., the 
dot product of two vectors, or the product of each element of a vector 
by a constant). 


To summarize then: 


o  An array is a group of numbers.


o  A vector is a sequence of numbers.


o  A scalar is a single number.


2.3 PROGRAM FLOW


Writing a FORTRAN program that calls on the AP to process data is 
basically the same as writing a FORTRAN program that runs exclusively 
on the host processor. Exceptions to this are as follows: 


o  The AP and APEX must be initialized before any other
   calls are made to the AP.


o  Data must be transferred from the host memory to the
   AP main data memory before the AP can operate on it.


o  In order to synchronize the operation of the AP with
   the host, wait calls must be inserted in the program
   whenever the host and the AP interact.


o  At the end of program execution, data must be transferred
   from the AP main data memory back to the host memory.


Figure 2-1 illustrates the necessary steps to follow when writing a
FORTRAN program to run on the AP. The following discussion addresses
each of these blocks separately. Figure 2-2 illustrates a FORTRAN 
program that directs the AP to add two vectors together. The sequence 
of hardware operations for this procedure is given in section 1.3.3. 
The program in Figure 2-2 is referred to throughout the following 
sections. 


2.3.1 DIMENSION DATA IN HOST MEMORY 


Before an array can be transferred to the AP, it must be dimensioned
and stored in the host memory. This is the first step in the example 
in Figure 2-2: 


DIMENSION A(1000), B(1000), C(1000) 


At this point, the user can create vectors to be processed by the AP. 
The DIMENSION command tells the host how many memory words to allocate 
for each vector, and gives each vector a name. The user can then use 
these names to call the data for transfer to the AP. Note that a 
vector C is also created in this example to provide a location in the 
host memory where the sum of the addition of the two vectors A and B 
can be stored. If it is not necessary to preserve a copy of A or B in 
the host, then the result can be stored back into A or B, thereby 
avoiding the additional host memory requirement. 


An alternate method of dimensioning the arrays is to combine both the A 
and the B vector into one 2000-word vector: DIMENSION A(2000), 
C(1000). This eliminates one of the data transfer calls required to 
transfer the two vectors to the AP, and reduces the program run time. 
However, it is a little more complicated for the user to keep track of 
the various vectors in the array. Dimensioning of data is described 
further in the following sections. 
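
The combined layout can be modelled on the host alone. The following is a minimal Python sketch, not AP library code: `put` is a hypothetical stand-in for APPUT that copies host values into a list representing the AP main data memory, the vector length is shortened to 4, and the data values are invented.

```python
# Hypothetical host-side model; put() stands in for CALL APPUT.
N = 4                                   # vector length (1000 in the text)
ap_memory = [0.0] * (3 * N)             # room for A, B and the result C

def put(host_values, ap_base):
    """Copy host values into consecutive AP words starting at ap_base."""
    for offset, value in enumerate(host_values):
        ap_memory[ap_base + offset] = value

# One 2N-word host array holds A in its first half and B in its second.
combined = [1.0, 2.0, 3.0, 4.0,         # A
            10.0, 20.0, 30.0, 40.0]     # B

put(combined, 0)                        # a single transfer instead of two

# The split into vectors is then expressed purely by AP base addresses.
IA, IB, IC = 0, N, 2 * N
vector_a = ap_memory[IA:IA + N]
vector_b = ap_memory[IB:IB + N]
```

The bookkeeping cost mentioned above shows up in the last lines: the program, not the array names, now has to remember where B begins.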


2.3.2 STORING THE ARRAY IN THE HOST MEMORY


With the array location established in the host memory, the user must
fill the memory locations with actual data. This means reading in the
data from a tape drive, an analog-to-digital converter, a disk drive,
etcetera. Figure 2-1 illustrates a general flowchart for writing a
FORTRAN calling program to perform an operation with the AP.


               DIMENSION DATA IN HOST MEMORY
                             |
                 STORE DATA IN HOST MEMORY
                             |
                       INITIALIZE AP
                             |
                ALLOCATE AP MAIN DATA MEMORY
                             |
               TRANSFER DATA FROM HOST TO AP
                             |
                    PROCESS DATA WITH AP
                             |
                           WAIT
                             |
               TRANSFER DATA FROM AP TO HOST


Figure 2-1 FORTRAN Calling Program Flowchart


In the following example (Figure 2-2), the vectors are created with an
arithmetic expression in a DO loop:


C------ FORTRAN program to add 2 vectors in AP120B and return result to
C       host
C
C------ Dimension vectors in host
C
      DIMENSION A(1000),B(1000),C(1000)
C
C------ Select size of vectors to be added
C
      N=1000
C
C------ Somehow create vectors A and B in host
C
      DO 10 I=1,N
      A(I)= .......
   10 B(I)= .......
C
C------ Initialize AP120B (must be done before any other
C       calls to AP120B)
C
      CALL APCLR
C
C------ Indicate we're transferring host floating-point numbers
C
      IFMT=2
C
C------ Allocate AP120B main data memory
C
      IA=0
      IB=N
      IC=N+N
C
C------ Transfer A and B from host to AP120B main data memory
C------ A is transferred to locations 0 - 999, B to
C       locations 1000 - 1999
C
      CALL APPUT(A,IA,N,IFMT)
      CALL APPUT(B,IB,N,IFMT)
C
C------ Wait until transfer is complete before doing computations
C       on data
C
      CALL APWD
C
C------ Perform vector addition in AP120B, storing results in
C       locations 2000 - 2999
C
      CALL VADD(IA,1,IB,1,IC,1,N)
C
C------ Wait until calculation is finished before getting results
C
      CALL APWR
C
C------ Now transfer result from locations 2000 - 2999 to host buffer C
C
      CALL APGET(C,IC,N,IFMT)
C
C------ Wait until transfer is complete before printing results, etc.
C       in host
C
      CALL APWD
C
C------ Print results, etc. in host
C
      .
      .
      .
      END
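
The data flow of the program above can be traced without an AP by using a small host-only model. The Python sketch below is an illustration under stated assumptions, not the Math Library or MATHSIM: the class `ApModel` and its method names are hypothetical, the model runs synchronously (so the wait calls have nothing to do and are omitted), and no floating-point format conversion is performed.

```python
class ApModel:
    """Toy stand-in for the AP: a flat list of main data memory words."""

    def __init__(self, size=8192):
        self.md = [0.0] * size          # models AP main data memory
        self.initialized = False

    def _check(self):
        # Models the rule that APCLR must precede every other call.
        if not self.initialized:
            raise RuntimeError("call apclr() before any other AP call")

    def apclr(self):
        self.initialized = True

    def apput(self, host, ap_base, n):
        """Copy n host elements to AP words ap_base .. ap_base+n-1."""
        self._check()
        for m in range(n):
            self.md[ap_base + m] = host[m]

    def vadd(self, a, i, b, j, c, k, n):
        """Element-wise add with base addresses a, b, c and increments i, j, k."""
        self._check()
        for m in range(n):
            self.md[c + m * k] = self.md[a + m * i] + self.md[b + m * j]

    def apget(self, ap_base, n):
        """Return n AP words starting at ap_base as a host list."""
        self._check()
        return self.md[ap_base:ap_base + n]


n = 5                                   # 1000 in the program above
a = [float(v) for v in range(1, n + 1)]
b = [10.0 * v for v in a]

ap = ApModel()
ap.apclr()
ia, ib, ic = 0, n, n + n                # same allocation scheme as IA, IB, IC
ap.apput(a, ia, n)
ap.apput(b, ib, n)
ap.vadd(ia, 1, ib, 1, ic, 1, n)
c = ap.apget(ic, n)
```

Calling any method before `apclr()` raises an error in this model, which mirrors the ordering requirement stated in the program comments.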


FPS 860-7288-004 2 - 6 


The routines in the AP Math Library operate on four different types of 
vectors or arrays: real vectors, complex vectors, complex FFT vectors 
and matrix arrays. In each of these cases, the routines assume that 
the vector or array is organized in a particular sequence. For 
example, each element of a complex vector requires two memory words: 
one word for the real part of the element and one for the imaginary 
part. The routines for operating on complex vectors assume that the 
parts of each complex element are stored in two consecutive addresses 
in the AP main data memory. 


The initial organization of arrays and vectors should be done when
dimensioning the host memory and storing data in the host. Refer to
the discussion of vector organization (section 2.4) for more details on
the vector formats and variations allowed when organizing vectors for
processing with the AP Math Library routines.


2.3.3 INITIALIZING THE AP 


Initially, the AP internal status register and DMA control register 
must be cleared, and the AP executive (APEX) must be initialized. This
is done with: 


CALL APCLR 


APCLR must be called before any other calls are made to the AP. 


2.3.4 ALLOCATING THE AP MAIN DATA MEMORY


The main data memory in the AP is organized into 38-bit floating-point 
words. The words are consecutively numbered from 0 to N-1, where N is 
the maximum size of the memory: 8192, 16384, etcetera. 


In complex programs where a number of transfers of data between the AP 
and the host are required, and where the arrays being operated on are 
large or numerous, it is recommended that the user take some care in 
allocating the AP memory before proceeding with the program. 


Dimensioning the AP is very simple. The user must establish where each 
vector is to reside in memory and establish an integer constant, 
variable name, or expression that specifies the base address (first 
word) of each vector location. 


For the example in Figure 2-2, memory allocation is done with the
following FORTRAN statements:


      IA = 0
      IB = N
      IC = N+N


Vector A is defined as starting at word 0 and going to word 999
(N=1000); vector B goes from 1000 to 1999; and the result, vector C, 
is stored from 2000 to 2999. The I that precedes each variable 
indicates that the addresses specified are integer values (standard 
FORTRAN convention). 
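
For programs with many vectors, the same bookkeeping can be automated. The following Python sketch is one hypothetical way to do it on the host: the function `allocate` is not a Math Library routine, and the 8192-word memory size is simply the smallest size mentioned above.

```python
def allocate(lengths, memory_size):
    """Return consecutive base addresses for vectors of the given lengths.

    lengths maps a vector name to its length in AP words. Addresses are
    handed out in order, starting at word 0, and a vector that would run
    past the end of memory is rejected.
    """
    bases = {}
    next_free = 0
    for name, length in lengths.items():
        if next_free + length > memory_size:
            raise ValueError(f"vector {name} does not fit in {memory_size} words")
        bases[name] = next_free
        next_free += length
    return bases


N = 1000
bases = allocate({"IA": N, "IB": N, "IC": N}, memory_size=8192)
```

The result reproduces the hand allocation of the example: IA at 0, IB at 1000 and IC at 2000.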


Section 2.3.1 suggests arranging the array in the host memory into one 
long vector as a means of reducing program run time. The dimensioning 
of the array into vectors can then be done in the AP with the type of 
memory allocation statements shown previously. 


There is one other consideration in allocating space in the AP main 
data memory. Many of the AP Math Library routines run at different 
speeds depending on the location of the vectors to be operated on in 
the AP main data memory. Program run time can occasionally be reduced
by specifying that certain vectors start on either even or odd memory 
addresses. (Refer to section 2.7.2 for further information on memory 
allocation.) 
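
Placing a vector on an even or odd address is simple arithmetic on the base address. The helper below is a hypothetical Python sketch; which parity is preferable for a given routine is not assumed here and should be taken from section 2.7.2.

```python
def next_address_with_parity(address, want_odd):
    """Return the first address >= address that has the requested parity."""
    is_odd = address % 2 == 1
    return address if is_odd == want_odd else address + 1
```

For example, a vector that would otherwise start at word 1000 stays there if an even start is wanted, and moves to word 1001 if an odd start is wanted, at the cost of one unused word.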


2.3.5 TRANSFERRING DATA FROM THE HOST TO THE AP


With these preliminary steps completed, the user can transfer the array 
to be processed from the host memory to the AP main data memory with an 
APPUT command. APPUT has four parameters: 


CALL APPUT (HOST, AP, N, TYPE) 


HOST specifies the initial element of the data in the host that is to
be moved to the AP. HOST can be a constant, a variable, an array name
or an array element. Typically, the HOST parameter consists of the
name of the first array element to be transferred; for example: A,
SIGA(50), MATB(101). As illustrated in Figure 2-2, the HOST parameters
in the two APPUT calls are A and B:


CALL APPUT (A, _, _, _)
CALL APPUT (B, _, _, _)


The parameter AP specifies the base address in the AP main data memory
where the data from the host memory is to be stored. AP can be an
integer constant, variable, or an expression that specifies an integer
number; for example: 101, IA, IA + 3*N. The AP parameters are
generally specified in the previous step, when allocating the AP
memory. As illustrated in the two APPUT commands in Figure 2-2, the
variables IA and IB are used for the AP parameters:


CALL APPUT (A, IA, _, _) 
CALL APPUT (B, IB, _, _) 


It is possible to omit the AP memory dimensioning step and merely use 
integer constants for AP. For example: 


CALL APPUT (A, 0, _, _) 
CALL APPUT (B, 1000, _, _) 


But specifically allocating the AP main data memory at the beginning of 
a FORTRAN program is good programming practice, especially when the 
program has many vector operations. 


N specifies the number of host data elements to be moved from the host 
to the AP. Note that a data element may consist of more than one host 
word. For example, a host floating-point number usually requires two 
host words, but occupies one word in the AP main data memory. Like the 
AP parameter, N can be an integer constant, variable, or an expression 
that specifies an integer number. Earlier in the example program, N 
was specified as being 1000. So, in this example, the variable N is 
used for the N parameter. 


CALL APPUT (A, IA, N, _) 


The number 1000 could also have been used for N.


CALL APPUT (A, IA, 1000, _) 


TYPE specifies the host data format and the type of conversion to be 
done between the host and AP during transfer. Format conversion of 
floating-point numbers is done automatically, on the fly, as part of 
the data transfer procedure. No conversion call is required for host 
floating-point numbers other than to specify the format with the TYPE
parameter (2 or 3).


The AP performs arithmetic using a 38-bit floating-point format 
(illustrated in Figure 2-3): one exponent sign bit, nine exponent 
bits, one mantissa sign bit, and 27 mantissa bits. The binary point is 
always located between the mantissa sign bit and the most significant 
bit of the mantissa. (Bits 0, 1 and 40 are parity bits.) 


BIT    2    3           11   12   13                            39
     +----+---------------+----+--------------------------------+
     | S  |   EXPONENT    | S  |            MANTISSA            |
     +----+---------------+----+--------------------------------+


Figure 2-3 AP Floating-point Format
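
To make the field layout concrete, the following Python sketch packs a number into a 38-bit word with a 10-bit exponent field above a 28-bit two's complement mantissa, and unpacks it again. Several details are assumptions of the sketch rather than statements from this manual: the exponent bias is taken to be 512, zero is stored as an all-zero word, the mantissa is truncated instead of convergently rounded, and parity bits are ignored.

```python
import math

EXP_BITS, MANT_BITS = 10, 28
BIAS = 512                      # assumed bias; not stated in the text
MANT_MASK = (1 << MANT_BITS) - 1

def encode(value):
    """Pack a Python float into a 38-bit integer (truncating, not rounding)."""
    if value == 0.0:
        return 0
    frac, exp = math.frexp(value)       # value = frac * 2**exp, 0.5 <= |frac| < 1
    mant = int(frac * (1 << (MANT_BITS - 1)))   # 27 fraction bits below the sign
    biased = exp + BIAS
    if not 0 <= biased < (1 << EXP_BITS):
        raise OverflowError("exponent outside the 10-bit range")
    return (biased << MANT_BITS) | (mant & MANT_MASK)

def decode(word):
    """Unpack a 38-bit integer produced by encode() back into a float."""
    biased = word >> MANT_BITS
    mant = word & MANT_MASK
    if mant >= 1 << (MANT_BITS - 1):    # sign bit set: negative mantissa
        mant -= 1 << MANT_BITS
    return mant / (1 << (MANT_BITS - 1)) * 2.0 ** (biased - BIAS)
```

With these conventions the binary point sits between the mantissa sign bit and the most significant mantissa bit, as described above, and values that need no more than 27 fraction bits survive a round trip exactly.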


TYPE can specify four different kinds of formats and format 
conversions, depending on whether TYPE = 0, 1, 2 or 3. 


When TYPE is 0, 32-bit integers are transferred from the host to the AP 
and stored without format conversion into the low 32 bits of the AP 
memory words (8 through 39). Refer to Chapter 3, Data Formatting 
Commands, for information on using the TYPE 0 and 1 formats. 


When TYPE is 1, 16-bit integers are converted into unnormalized AP 
floating-point numbers. These numbers must be normalized (floated) 
before they can be processed using an AP Math Library routine. FLT is 
the normalizing command. 


Normalization of a floating-point number means the number is adjusted 
so that the most significant bit of the mantissa is located in bit 13 
of the 38-bit word. There is a corresponding adjustment of the 
exponent.
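
The idea of normalization can be shown with integer arithmetic. The Python sketch below is not the FLT routine; it assumes a 28-bit two's complement mantissa held as a signed integer and an unbiased exponent, and it assumes that an unnormalized integer arrives with an exponent of 27 so that the mantissa bits read directly as the integer value.

```python
MANT_BITS = 28
HALF = 1 << (MANT_BITS - 2)     # 2**26: weight of the bit just below the sign bit

def normalize(mant, exp):
    """Shift the mantissa left until its top magnitude bit is significant.

    mant is a signed integer in [-2**27, 2**27); the value represented is
    mant / 2**27 * 2**exp. Returns an equivalent (mant, exp) pair.
    """
    if mant == 0:
        return 0, 0             # this sketch maps zero to exponent 0
    while -HALF <= mant < HALF:
        mant <<= 1              # one place left doubles the mantissa ...
        exp -= 1                # ... so the exponent drops by one
    return mant, exp
```

For the integer 5, `normalize(5, 27)` shifts the mantissa left 24 places and lowers the exponent from 27 to 3, leaving the represented value unchanged.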


Typically, when TYPE is 2, host single-precision floating-point numbers
are transferred to the AP and converted into normalized AP 
floating-point numbers. When the AP is installed in a system, it is 
set to convert the type of floating-point format used by the specific 
host. 


Typically, when TYPE is 3, IBM 360 32-bit floating-point format numbers 
are converted to normalized AP floating-point numbers. 


As illustrated in Figure 2-2, a variable is assigned immediately
following CALL APCLR to define the floating-point format being used:
IFMT = 2. This variable is then used for the TYPE parameter in the
following statement of the program:


CALL APPUT (A, IA, N, IFMT) 


The number 2 could also have been used for TYPE: 


CALL APPUT (A, IA, N, 2) 


2.3.6 SYNCHRONIZATION


Two wait commands, APWR and APWD, are available to ensure that the AP 
and the host are synchronized in their operation when required. 


APWD (wait on data) causes the host program to wait until a data 
transfer between the host and the AP (the result of a CALL APPUT or 
CALL APGET) has been completed before the host resumes execution of the 
program.


APWR (wait on running) causes the host to wait until the AP has 
finished running before it resumes execution of the program. In 
general, whenever a data transfer command is called following the 
execution of a routine by the AP, an APWR should precede the data 
transfer command. 


The two data transfer routines (APPUT and APGET) both wait (in effect 
CALL APWD) for any previous data transfer to be completed before 
starting a new data transfer. Two APPUT calls can thus be made in 
succession without calling a wait in between. Also, the arithmetic 
operations in the AP Math Library all wait (in effect CALL APWR) for 
any previous arithmetic operation to be completed before starting a new 
operation. 


APWAIT is a third command that combines the operations of APWD and
APWR. It causes the host to wait until any data transfer and any 
routine execution are both completed before it continues to execute the 
program. 


The AP host interface is capable of transferring data to and from the 
host while it is processing data. This might be done as a method of 
reducing program run time. The wait commands can be omitted in cases 
where it is certain that the data being transferred and the data being 
processed are not the same. This programming technique should be used 
with caution because it can cause errors in computations. It is good 
programming practice to include the wait calls. Refer to section 2.7.4
for more information on programming data transfers while the AP is 
processing. 
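
The conservative placement of wait calls described in this section can be written down as a small rule table. The Python sketch below is an interpretation of the text, not part of APEX: the step names and both functions are hypothetical, and the rule always inserts a wait where the host and the AP interact.

```python
# Kinds of host-side steps, in program order.
TRANSFER, COMPUTE, HOST_USE = "transfer", "compute", "host_use"

def wait_needed(previous, following):
    """Return the wait call to place between two steps, or None."""
    if previous == TRANSFER and following in (COMPUTE, HOST_USE):
        return "APWD"       # transferred data must have arrived before use
    if previous == COMPUTE and following in (TRANSFER, HOST_USE):
        return "APWR"       # the AP must finish before results are moved
    return None             # transfer->transfer and compute->compute wait implicitly

def insert_waits(steps):
    """Interleave the conservative wait calls into a list of steps."""
    result = []
    for index, step in enumerate(steps):
        result.append(step)
        if index + 1 < len(steps):
            wait = wait_needed(step, steps[index + 1])
            if wait:
                result.append(wait)
    return result


# Two APPUTs, one VADD, one APGET, then printing on the host.
plan = insert_waits([TRANSFER, TRANSFER, COMPUTE, TRANSFER, HOST_USE])
```

Applied to the vector-add program, the rule yields APWD after the second APPUT, APWR after VADD, and APWD after APGET.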


2.3.7 PROCESSING DATA


Once the array to be processed is stored in the AP main data memory, 
the user can operate on it with the AP Math Library routines. In this 
example, the corresponding consecutive elements of the two 1000-element 
vectors beginning at addresses IA (=0) and IB (=1000) are added 
together, and the 1000 sums are stored in the AP main data memory 
starting at base address IC (=2000): 


CALL VADD (IA, 1, IB, 1, IC, 1, N) 


2.3.8 TRANSFERRING DATA BACK TO THE HOST


When array processing has been completed, the user can transfer the 
resultant array back to the host with an APGET command. The user 
should remember to call the APWR command to be sure the AP is done 
processing before transferring data. APGET uses the same four 
parameters as APPUT. The APGET call is written as follows: 


CALL APGET (C, IC, N, IFMT) 


The resultant 1000-word vector is thus moved from AP main data memory 
locations 2000 to 2999 to the host memory array C, set up with the
original DIMENSION command. IFMT is again 2, which means that each 
element of the vector is converted from AP floating-point format to 
host single-precision format. 


When TYPE is 0 in an APGET command, the low 32 bits of the AP memory
words are transferred without format conversion to the host memory. 
When TYPE is 1, the low 16 bits of the AP memory words are transferred 
to the host memory. VFIX (refer to Data Formatting Commands in Chapter 
3) can be called prior to this command to convert 38-bit floating-point 
numbers to 16-bit integers. When TYPE is 3, the AP floating-point 
numbers are converted into IBM 360 single-precision floating-point 
numbers and transferred to the host memory. 


If overflow or underflow is detected on conversion from AP format when
TYPE 2 or 3 format is selected, a signed maximum quantity is forced on
overflow and zero on underflow. This can occur because the dynamic
range of the AP (10**-153 to 10**153) is greater than that of most
host computers.
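
The overflow and underflow rule can be expressed as a clamp. The Python sketch below is only a model of the behavior described above; the function name is hypothetical and the host limits are passed in as parameters because they depend on the particular host format.

```python
def to_host(value, host_max, host_min_positive):
    """Clamp an AP result into the host's representable range.

    host_max is the largest magnitude the host format can hold and
    host_min_positive the smallest non-zero magnitude.
    """
    magnitude = abs(value)
    if magnitude > host_max:
        # Overflow: force the maximum quantity, keeping the sign.
        return host_max if value > 0 else -host_max
    if 0 < magnitude < host_min_positive:
        # Underflow: force zero.
        return 0.0
    return value
```

With illustrative limits of 1e38 and 1e-38, a result of -1e100 is returned to the host as -1e38 and a result of 1e-100 as zero.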


2.4 VECTOR ORGANIZATION


This section discusses vector organization. 


2.4.1 REAL VECTORS 


Three parameters are required to define a real vector: a starting (or 
base) address, an address increment, and an element count. The base 
vector address is the AP main data address of the first vector element 
to be operated on. The address increment specifies the interval
(difference in addresses) between one element of the vector and the
next. The element count specifies the number of elements of the vector
to be operated on (e.g., the number of multiplications to be
performed). For example:


CALL VMUL(A,I,B,J,C,K,N)


Here A, B and C are base addresses for the three vectors involved in a
vector multiply operation. I, J and K are the address increments 
associated with vectors A, B and C, respectively. N is the element 
count for each of the vectors. A typical call is: 


CALL VMUL (100,1,200,2,300,-1,5)


For real vectors where elements are stored in consecutive locations,
the address increment is 1. Most Math Library functions, however,
allow the additional flexibility of specifying arbitrary increments.
Table 2-1 shows the memory allocations made in the preceding example.


Table 2-1 CALL VMUL(100,1,200,2,300,-1,5) Memory Allocations


ADDRESS   ELEMENT

100       a(1)
101       a(2)
102       a(3)
103       a(4)
104       a(5)
105       --
200       b(1)
201       --
202       b(2)
203       --
204       b(3)
205       --
206       b(4)
207       --
208       b(5)
296       c(5) = a(5) * b(5)
297       c(4) = a(4) * b(4)
298       c(3) = a(3) * b(3)
299       c(2) = a(2) * b(2)
300       c(1) = a(1) * b(1)
301       --
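

The addressing rule for real vectors (base address, address increment,
element count) can be checked with a short Python sketch. The helper
name vector_addresses is hypothetical and exists only for this
illustration; it is not a Math Library routine.

```python
def vector_addresses(base: int, increment: int, count: int) -> list[int]:
    """Return the main data memory address of each vector element."""
    return [base + k * increment for k in range(count)]


# The three vectors of CALL VMUL (100,1,200,2,300,-1,5)
a_addresses = vector_addresses(100, 1, 5)    # consecutive locations
b_addresses = vector_addresses(200, 2, 5)    # every second location
c_addresses = vector_addresses(300, -1, 5)   # stored in descending order
```

With a negative increment the result vector runs downward from its base
address, which is why c(1) sits at 300 and c(5) at 296.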


2.4.2 COMPLEX VECTORS


For operations involving complex vectors, each complex element occupies 
two consecutive addresses in main data memory. If the complex vector 
is in rectangular form, then the imaginary component immediately 
follows the real. In polar form, the phase (in radians) immediately 
follows the magnitude. 


The base address of a complex vector specifies the address of the real
part of the first element. The address increment specifies the
address interval (difference in addresses) between one real part and
the next real part. For complex vectors, this interval must be at
least 2. The element count refers to the number of complex elements
(i.e., the number of reals or of imaginaries) to be operated on. For
example:


CALL CVMUL(A,I,B,J,C,K,N,F)


Here A, B and C are complex vectors with address increments of I, J and
K, respectively. N is the number of complex elements to be operated on
for each vector. F is a flag that is set to 1 for a normal complex
multiply, and to -1 if the multiply is to use the complex conjugate of
vector A. The following call is an example of a normal complex
multiply involving four complex elements:


CALL CVMUL (100,2,200,3,300,2,4,1)


The memory allocations for this example are shown in Table 2-2.


Table 2-2 CALL CVMUL(100,2,200,3,300,2,4,1) Memory Allocations


ADDRESS   ELEMENT

100       ar(1)
101       ai(1)
102       ar(2)
103       ai(2)
104       ar(3)
105       ai(3)
106       ar(4)
107       ai(4)
108       --
200       br(1)
201       bi(1)
202       --
203       br(2)
204       bi(2)
205       --
206       br(3)
207       bi(3)
208       --
209       br(4)
210       bi(4)
300       cr(1)
301       ci(1)
302       cr(2)
303       ci(2)
304       cr(3)
305       ci(3)
306       cr(4)
307       ci(4)
308       --
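

The layout of a complex vector can be reproduced with a small Python
sketch. The helper name complex_addresses is hypothetical, and the
treatment of negative increments is an assumption, since the text only
states that the interval must be at least 2.

```python
def complex_addresses(base: int, increment: int, count: int) -> list[tuple[int, int]]:
    """Return (real address, imaginary address) for each complex element."""
    # Assumption: a negative increment, if allowed, obeys the same magnitude limit.
    if abs(increment) < 2:
        raise ValueError("complex vectors need an address increment of at least 2")
    pairs = []
    for k in range(count):
        real_address = base + k * increment
        pairs.append((real_address, real_address + 1))  # imaginary follows real
    return pairs


# Vector B of CALL CVMUL (100,2,200,3,300,2,4,1)
b_pairs = complex_addresses(200, 3, 4)
```

An increment of 3 leaves one unused word after each complex element,
which accounts for the gaps at 202, 205 and 208.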


2.4.3 RFFT COMPLEX FORM


A special complex vector form exists for the result of a forward
real-to-complex FFT using routines RFFT or RFFTB. For example:


CALL RFFT(C,N,F)


Here, if F=1, a forward Fast Fourier Transform of a real vector of
length N is taken. The result is a complex vector with N/2 + 1 complex
elements, but since two of those complex elements (the first and last)
have zero imaginary parts, the result can be packed into N locations.
The following call is an example of an in-place 8-point forward 
real-to-complex Fast Fourier Transform. 


CALL RFFT(100,8,1) 


The memory allocations before and after the transformation are shown in 
Table 2-3. Note that FFT input data must be in consecutive locations. 


Table 2-3 Memory Allocations before and after CALL RFFT (100,8,1)


ADDRESS   ELEMENT

BEFORE

100       t(0)
101       t(1)
102       t(2)
103       t(3)
104       t(4)
105       t(5)
106       t(6)
107       t(7)

AFTER

100       fr(0)
101       fr(4)
102       fr(1)
103       fi(1)
104       fr(2)
105       fi(2)
106       fr(3)
107       fi(3)


Before additional complex operations are performed on the FFT result, 
the complex vector should be unpacked into proper form by moving the 
element fr(4) to location 108 and zeroing locations 101 and 109. 


The inverse complex-to-real FFT operation (RFFT or RFFTB with F=-1)
expects the complex vector to be in the packed form illustrated in
Table 2-3.
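

The packed form and the unpacking step can be illustrated in Python.
This is a sketch under stated assumptions: the function names are
hypothetical, and the plain DFT used to generate reference values
assumes a negative-exponent, unscaled forward transform, which may not
match the sign and scaling conventions of RFFT itself.

```python
import cmath


def forward_dft(t):
    """Plain O(N^2) DFT returning the first N/2 + 1 complex elements."""
    n = len(t)
    return [sum(t[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
            for k in range(n // 2 + 1)]


def pack_rfft(f):
    """Pack N/2 + 1 complex elements into N words: fr(0), fr(N/2), fr(1), fi(1), ..."""
    half = len(f) - 1
    packed = [f[0].real, f[half].real]
    for k in range(1, half):
        packed += [f[k].real, f[k].imag]
    return packed


def unpack_rfft(packed):
    """Expand N packed words to N + 2 words in ordinary complex form."""
    # fr(N/2) moves to the end; fi(0) and fi(N/2) are set to zero.
    return [packed[0], 0.0] + packed[2:] + [packed[1], 0.0]
```

For N = 8 the unpacked list has ten words, corresponding to locations
100 to 109 in the example above, with fr(4) in the ninth word.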


2.4.4 MATRICES


Matrices are stored in column order in main data memory. A matrix is 
defined by a base address, an address increment, a row count, and a 
column count. The base address represents the element in the first row 
and column to be operated on. The address increment specifies the 
interval (difference in addresses) between one element of the matrix 
and the next. The row count specifies the number of elements to be 
operated on per column (i.e., the number of rows), while the column 
count specifies the number of columns in the matrix. For example: 


CALL MTRANS (A,I,C,K,M,N) 


Here, M columns and N rows of the matrix with base address A are 
transposed to a matrix whose M rows and N columns are stored starting 
at address C. I and K are the address increments for A and C, 
respectively. The following call transposes a 3-row by 2-column
matrix.


CALL MTRANS (100,1,200,2,2,3)


The memory allocations for the matrices are shown in Table 2-4.


Table 2-4 CALL MTRANS(100,1,200,2,2,3) Memory Allocations


ADDRESS   ELEMENT

100       a(1,1)
101       a(2,1)
102       a(3,1)
103       a(1,2)
104       a(2,2)
105       a(3,2)
106       --
200       c(1,1) = a(1,1)
201       --
202       c(2,1) = a(1,2)
203       --
204       c(1,2) = a(2,1)
205       --
206       c(2,2) = a(2,2)
207       --
208       c(1,3) = a(3,1)
209       --
210       c(2,3) = a(3,2)
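

Column-order storage and the effect of the address increments can be
modelled in Python. The function below is a hypothetical stand-in that
mirrors the parameter order of MTRANS; it operates on a dictionary
standing in for main data memory and is not the library routine.

```python
def mtrans(memory: dict, a: int, i: int, c: int, k: int, m: int, n: int) -> None:
    """Transpose the matrix at base a (n rows, m columns) into base c (m rows, n columns)."""
    for col in range(m):        # columns of the source matrix
        for row in range(n):    # rows of the source matrix
            source = a + (col * n + row) * i        # column-order address in A
            destination = c + (row * m + col) * k   # column-order address in C
            memory[destination] = memory[source]


# Main data memory for CALL MTRANS (100,1,200,2,2,3), holding element names
memory = {100: "a(1,1)", 101: "a(2,1)", 102: "a(3,1)",
          103: "a(1,2)", 104: "a(2,2)", 105: "a(3,2)"}
mtrans(memory, 100, 1, 200, 2, 2, 3)
```

After the call, address 204 holds a(2,1), which is element c(1,2) of the
transposed matrix.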


2.4.5 DOUBLE-PRECISION ELEMENTS

Like complex elements, each double-precision element occupies two
consecutive addresses in main data memory. The most significant part
of the element comes first and the least significant part second. Both 
words are stored in normal 38-bit floating-point format with the 
exponent of the second word being 27 less than the exponent of the most 
significant word. 


MOST SIGNIFICANT PART    EXPONENT      MOST SIGNIFICANT 27 BITS OF THE MANTISSA
LEAST SIGNIFICANT PART   EXPONENT-27   LEAST SIGNIFICANT 27 BITS OF THE MANTISSA


Figure 2-3 Double-Precision Element
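

The two-word arrangement can be approximated in Python by splitting a
value into a leading part that carries the first 27 mantissa bits and a
remainder. This is a rough model only: the name split_double is
hypothetical, host floating-point values stand in for AP words, and the
AP's 2's complement mantissa and rounding behavior are ignored.

```python
import math

MANTISSA_BITS = 27  # mantissa bits carried by each word, per the text above


def split_double(x: float) -> tuple[float, float]:
    """Split x into (most significant part, least significant part)."""
    if x == 0.0:
        return 0.0, 0.0
    m, e = math.frexp(x)                    # x = m * 2**e with 0.5 <= |m| < 1
    scaled = math.ldexp(m, MANTISSA_BITS)   # shift 27 mantissa bits above the point
    hi = math.ldexp(math.trunc(scaled), e - MANTISSA_BITS)
    lo = x - hi                             # remaining low-order bits
    return hi, lo
```

The remainder is always smaller than 2**(e - 27), which corresponds to
the exponent of the second word being 27 less than that of the first.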


2.5 PROGRAM RUN-TIME ENVIRONMENT


Two factors affect the total time it takes to execute an AP Math 
Library routine called from a FORTRAN program: the AP execution time 
of the individual routine, and the host system overhead. The AP has a 
167ns cycle time during which several operations (add, multiply, fetch, 
move, branch, etc.) can be performed. All of the AP Math Library 
routines have been written to make the most efficient use of this 
parallel structure of the AP and its 167ns basic machine cycle time. 


Prior to the execution of a routine by the AP, the host system must 
load the routine into the AP program source memory (if it has not been 
previously loaded), and must load the parameters into the s-pad 
registers. This time interval -- called the host overhead -- adds to 
the total program run time. Host overhead varies from system to system 
depending on the complexity of the host operating system and the number 
of other operations the host is expected to control along with the AP. 
Host overhead is typically 100 to 1000 microseconds. 


Some knowledge of the host/AP run time environment helps the user 
understand the effect the host overhead has on total program run time. 
This knowledge is also helpful in section 2.7 where techniques are 
given which may permit some reduction of both the AP execution time and 
host overhead.


2.6 UNDERSTANDING HOST OVERHEAD


This section presents information about host overhead. 


2.6.1 THE LOAD MODULE 


Figure 2-4 shows the standard procedure for writing a FORTRAN program,
compiling it, and linking it with the AP Math Library and user-written
FORTRAN callable routines. The final load module (refer to Figure 2-5)
includes:


- the compiled user-written FORTRAN code

- the various array processor routines called in the
  program and their AP 64-bit instruction words

- the AP executive subroutine (APEX), including a table
  which APEX uses to keep track of the contents of the AP
  program source memory


[Figure 2-4: flowchart. Recoverable box labels: WRITE FORTRAN PROGRAM;
COMPILE IT WITH HOST FORTRAN COMPILER; HOST RELOCATABLE OBJECT CODE;
LINK PROGRAM (with HOST RELOCATABLE OBJECT LIBRARY and with
USER-WRITTEN FORTRAN-CALLABLE AP ROUTINES ASSEMBLED INTO HOST
RELOCATABLE OBJECT CODE); LOAD MODULE, READY FOR EXECUTION; EXECUTE
THE PROGRAM]


Figure 2-4 AP/Host FORTRAN Software Connection


2.6.2 RUNNING THE FORTRAN PROGRAM


At run time, the load module which contains the FORTRAN calling 
program, AP Math Library routines, and APEX, is read into the host 
memory, and the host begins executing the FORTRAN program. When a 
routine is called, the host jumps to the routine and executes it. If 
the routine is an AP Math Library routine, a jump to APEX is made. 


APEX is a subroutine that controls the interaction of the host with the 
AP. It handles the loading of the appropriate AP 64-bit instruction 
words into the AP program source memory, the allocation of the program 
source memory locations, the loading of the parameters for the
routines, and the initiation of execution of the instructions by the AP.


The AP program source memory has a minimum size of 512 words. It can
be enlarged in 256-word increments to a maximum of 4096 words. Each
word in the program source memory is 64 bits long and contains the 
instruction to be executed during one 167ns clock cycle. Once the 
instructions for a routine have been read into the program source 
memory, APEX notes the name of the routine and its location in the 
program source memory and calculates the remaining space available in 
the program source memory. This information is stored in a table in 
the host memory. If a routine is called a second time, APEX does not 
reload the instructions, but merely loads the new parameters and 
initiates execution. If a FORTRAN program uses more AP routines than 
there is space for in the program source memory, APEX overwrites new 
instructions in the program source memory on a last-in, first-out 
basis. 


Once the execution of the routine begins, APEX returns control to the 
user-written FORTRAN calling program. This procedure is repeated each 
time an AP Math Library routine is called. If a routine is called 
before the AP has finished running a previously-called routine, APEX 
waits until the AP has completed the routine before it loads the new 
instructions and/or parameters, and then starts execution of the new 
routine. When data transfers are called for, APEX tells the host when 
to begin according to the wait commands -- APWD, APWR and APWAIT -- in 
the FORTRAN program.


2.6.3 RUN TIME AT THE APEX LEVEL


Figure 2-5 illustrates the sequence of events which occur when a
FORTRAN program calls an AP routine which has not yet been loaded
(e.g., VADD). When the execution of the FORTRAN program gets to CALL
VADD, the program jumps to the VADD routine. VADD does nothing more
than call APEX. APEX identifies the calling routine by its return
address. This address is then entered in the table in the host memory,
and is used to determine whether or not the instructions for the
current call are already resident in the AP program source memory. If
the routine for the current call is not already resident in the AP,
APEX obtains the instructions from the calling routine and transfers
them to the AP, making an appropriate entry in a table. This table
entry records the starting location in program source memory where the 
instructions have been loaded. APEX also computes the amount of the 
program source memory space that still remains unused and enters this 
number in the table. It uses this number in future calls to determine 
if newly-called instructions must be overlaid in the program source 
memory. 


APEX always tries to load the new instructions in a location that does 
not destroy previously-loaded instructions. If this is not possible, 
previous entries in the table are progressively deleted until there is 
room for the current instructions. The new instructions are then 
overlaid in the newly-allocated location in the program source memory.
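

The bookkeeping described above can be pictured with a toy Python
model. Everything in it is a hypothetical simplification: the class
name, the routine lengths used in the example, and the choice to
identify routines by name rather than by return address. It is a sketch
of the last-in, first-out policy, not the APEX implementation.

```python
class ProgramSourceTable:
    """Toy model of the APEX table for a program source memory of `size` words."""

    def __init__(self, size: int = 512):
        self.size = size
        self.entries = []  # (name, start, length), in the order loaded

    def free_words(self) -> int:
        return self.size - sum(length for _, _, length in self.entries)

    def call(self, name: str, length: int) -> bool:
        """Return True if the routine had to be loaded, False if already resident."""
        if any(entry[0] == name for entry in self.entries):
            return False                  # resident: only parameters are loaded
        if length > self.size:
            raise ValueError("routine is larger than program source memory")
        while self.free_words() < length:
            self.entries.pop()            # delete the most recently loaded entry
        start = sum(entry[2] for entry in self.entries)
        self.entries.append((name, start, length))
        return True


table = ProgramSourceTable(512)
table.call("VADD", 200)   # loaded at word 0 (length is made up)
table.call("VMUL", 200)   # loaded at word 200
table.call("VADD", 200)   # already resident, no reload
table.call("CFFT", 300)   # no room: VMUL is deleted, CFFT loads at word 200
```

In this model a routine loaded early survives later overlays, which is
the behavior that makes it worthwhile to call frequently used routines
first.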


The actual loading of the AP instructions is accomplished via an I/O
operation that is initiated by APEX but actually executed in a device
handler.


Once the instructions have been loaded in the AP program source memory, 
APEX obtains the subroutine parameters, transfers them to the AP s-pad 
registers, and triggers execution of the instructions. APEX then 
returns control to the routine which called it; that routine 
immediately returns control to the FORTRAN calling program. 


The time used between the call in the FORTRAN program and the beginning 
of execution of instructions in the AP constitutes the host overhead 
for that call. This host overhead (typically 100 to 1000 microseconds) 
is incurred each time an AP Math Library routine is called.


Figure 2-5 Transferring AP Instructions from Host Memory
to AP Program Source Memory


2.7 OPTIMIZING PROGRAM RUN TIME


A number of items affect the rate at which a FORTRAN program runs on 
the AP. Significant factors are the cycle rate of the main data memory 
and the placement of vectors in the main data memory. Host overhead 
and the timing of data transfer between the host and the AP also have 
an effect on program run time. 


2.7.1 FACTORS AFFECTING AP EXECUTION TIME


Floating Point Systems, Inc., offers main data memory for the AP with a
choice of two different cycle rates: 167ns or 333ns. This cycle rate
is the minimum time it takes to access a word of memory following a
previous access. This minimum time is achieved when consecutive memory
accesses alternate between even and odd addresses, or between 8K or 32K
memory banks (depending on the chip type). If consecutive accesses
specify only even (or only odd) addresses, then the access takes 167ns
longer for either memory. The machine cycle rate of the AP is 167ns,
so the choice of memory can have an effect on how fast the program
runs.


If a routine requires the memory to be accessed each machine cycle, the
routine runs twice as fast with the 167ns memory as it would with the
333ns memory, provided the even-odd address interleaving is maintained.
If the routine calls for a memory access only every third or every
fourth machine cycle, the routine runs at the same rate with either
memory.
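

The effect of even-odd interleaving can be estimated with a simple
Python model. It is a sketch under strong assumptions: accesses are
issued back to back, only address parity is considered (the 8K or 32K
bank alternation is ignored), and the function name is hypothetical.

```python
PENALTY_NS = 167  # extra delay when consecutive accesses share the same parity


def access_time_ns(addresses, cycle_ns: int) -> int:
    """Total time for a run of back-to-back memory accesses."""
    total = 0
    previous = None
    for address in addresses:
        total += cycle_ns
        if previous is not None and previous % 2 == address % 2:
            total += PENALTY_NS   # both even or both odd: slower access
        previous = address
    return total


alternating = access_time_ns([0, 1, 2, 3], 167)   # even, odd, even, odd
even_only = access_time_ns([0, 2, 4, 6], 167)     # all even
```

In this model four alternating accesses take 668ns on the 167ns memory,
while four even-only accesses take 1169ns.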


In actual operation, routines in the AP Math Library run anywhere from 
the same rate to twice as fast on the 167ns memory depending on the 
routine. The AP Math Library routines are written differently for the 
two types of memory when necessary to obtain the optimum speed. The 
calling sequence and numerical results are identical in each case. 


2.7.2 SPECIFYING VECTOR LOCATIONS IN MAIN DATA MEMORY


Because of the even-odd interleave of main data memory, subroutines run
at different rates depending on where the vectors are located (i.e.,
the base address), and also on the address increments associated with
each vector.


Three execution times (BEST, TYPICAL, and WORST) are given for each
memory type in each description of a Math Library routine in Appendix
E. When operating on real vectors, the TYPICAL time reflects the
typical situation where all vectors are compactly stored (the address
increments I, J and K equal 1 or any odd number: -1, 3, 5, etc.), and
the base addresses are either all even or all odd.


Sometimes it is possible to achieve faster execution by varying the 
base addresses of vectors between even and odd locations. The 
vector(s) whose base address(es) should be odd when the others are even 
(or even when the others are odd) are indicated in parentheses next to 
the BEST execution time. If no vectors are indicated, the best and 
typical execution times are the same. 


The worst case times involve other even-odd addressing and increment 
combinations. 


Table 2-5 shows the timing for the VADD routine.


Table 2-5 VADD Execution Times


           EXECUTION TIME/LOOP (us)
MEMORY     BEST       TYPICAL    WORST      SETUP (us)

167 ns     0.5 (B)    0.8        1.0        2.7


Note that the execution time is specified on a per-loop basis. Thus, 
if VADD is called to add two 1000-element vectors, the typical 
execution time with a 167ns memory is 1000 x 0.8 (really 0.833us) = 
833us, plus an additional 2.7us of SETUP time needed to initially fill 
the AP pipeline. Thus, the total execution time using 167ns memory is 
836us when all base addresses are even (or all odd). 


Three base address parameters are specified for VADD: one each for
the two vectors to be added together, A and B, and one, C, for the
location where the result is to be stored. With the 167ns memory, the
best run time is obtained when B is an odd address and A and C are even
(or vice versa: B is even and A and C are odd). For the example in
the preceding paragraph, an execution time of 1000x0.5+2.7=502.7us is
obtained if, for example, address A is 0, address B is 1001, and
address C is 2002. With the 333ns memory, address A must be odd (even)
when B and C are even (odd) to obtain the fastest execution (e.g., A=0,
B=1001, C=2001).
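

The arithmetic behind these figures is the per-loop time multiplied by
the element count, plus the setup time. A minimal Python sketch follows;
the function name is hypothetical, and the loop and setup times are the
167ns-memory values quoted in the text.

```python
def routine_time_us(n: int, loop_us: float, setup_us: float) -> float:
    """AP execution time for n loops, excluding host overhead."""
    return n * loop_us + setup_us


typical_us = routine_time_us(1000, 0.833, 2.7)  # all base addresses even (or all odd)
best_us = routine_time_us(1000, 0.5, 2.7)       # B odd while A and C are even
```

Under these figures, choosing the base addresses well saves roughly 40
percent of the AP execution time for a 1000-element VADD.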


With complex vector operations, the typical time reflects the most 
likely situation where the increment is even (for compactly stored 
complex vectors the increment is two), and all base addresses are 
either all even or all odd. (An increment of one with a complex vector 
operation produces unpredictable results since each complex element 
requires two main data memory words.) When faster execution is
possible with even complex vector increments by adjusting the base 
addresses between even and odd locations, the vector(s) whose base 
address(es) should be odd when the others are even (or even when the 
others are odd) are indicated next to the BEST execution time. 


The times given for matrix operations are for cases where the matrices 
are stored compactly (i.e., the memory increments are one so that the 
matrix elements are stored in consecutive addresses). In some cases, 
the run times are data-dependent, and in other cases they are dependent 
on the sizes of the matrices being operated on.


2.7.3 MINIMIZING THE EFFECT OF HOST OVERHEAD


There may be certain situations, especially when the host is operating 
in a multi-user, multi-task environment, where the host overhead time
represents a substantial fraction of the total run time involved in 
processing with the AP. The purpose of this section is to suggest some 
techniques for reducing the effect of this overhead if the application 
is time critical. 


- Combine FORTRAN calls to the AP. The most effective
  method of minimizing the effects of host overhead is to
  reduce the number of calls to the AP from the host.
  Often this can be done by careful layout of the
  program.

  Some suggestions:

    - Concentrate several vectors in consecutive addresses
      so that several vectors can be transferred with a
      single APPUT call.

    - Use AP Math Library routines which replace multiple
      library calls. For example, VMMA performs the same
      operations as two VMUL calls and a VADD, with a
      saving not only of two host calls but also of 40
      percent in AP execution time.

    - Overlap the operations of the host and the AP whenever
      possible.

    - Since all the AP Math Library routines can be called
      as routines from other AP assembly language programs,
      it is possible to write special FORTRAN callable array
      processing routines which combine a series of calls
      to the AP Math Library. One FORTRAN call to the
      special routine then replaces the separate calls in
      the host program. Take care that the special purpose
      routine (including all the AP Math Library routines)
      is small enough to fit in the available program
      source memory space.


- Load the most used routines first. If the program
  requires more space in the program source memory than
  there is available, the user can minimize some of the
  effects of host overhead by calling the most often used
  routines in the program first. As was stated in the
  discussion of the run-time environment, APEX uses the
  last-in, first-out technique to allocate space for
  routines in the program source memory. If there is no
  room in the program source memory for a new routine, the
  last program words read into the memory are over-written
  until there is enough space for the new routine. Since it
  requires less host overhead to load the parameters of a
  routine that already exists in program source memory than
  to load both the routine and the parameters, it is
  advantageous to call the most often used routines early in
  the program to make sure they are located well down in the
  program source memory. It may even be useful to call a
  routine before it is needed and give it only a dummy
  operation to do.


- Other suggestions:

    - In single-task host operating systems, APEX
      generally talks directly to the AP; in multi-task
      systems, APEX usually must ask the host system
      for permission to talk to the AP. This may take
      as long as 1 ms. In such a situation, consider
      the possibility of switching to a simpler host
      operating system.

    - Operating on large arrays is more efficient than
      operating on small arrays.

    - Consider overlapping data transfer and processing.
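

The benefit of combining calls can be estimated with a short Python
calculation. The AP execution times and the 500us overhead used here
are made-up illustrative values (the overhead lies within the typical
100 to 1000 microsecond range quoted earlier), and the model assumes no
overlap between host and AP activity.

```python
def total_time_us(ap_times_us, overhead_us: float) -> float:
    """Run time for a series of AP calls, each paying the host overhead once."""
    return sum(ap_us + overhead_us for ap_us in ap_times_us)


# Hypothetical figures: two VMUL calls and a VADD at 100us each,
# against one VMMA call taking 40 percent less AP time (180us).
separate_us = total_time_us([100.0, 100.0, 100.0], 500.0)
combined_us = total_time_us([180.0], 500.0)
```

With these numbers most of the saving comes from the two host calls
that are avoided rather than from the faster AP execution.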


2.7.4 OVERLAPPING DATA TRANSFER AND PROCESSING


The AP’s highly parallel operation allows the user to write programs 
that process and transfer data simultaneously. This type of 
programming involves leaving out some of the program synchronization 
commands -- APWR, APWD and APWAIT. Leaving out wait commands, however, 
presents the potential problem of asynchronous operation between the AP 
and the host. Thus, there is a chance that results are processed 
before they are actually present in their assigned location in main 
data memory. The wait commands are provided to avoid such problems. 


Programming involving simultaneous processing and data transfers is 
available at the FORTRAN level as well as the AP machine language 
level. The advantage is that it can speed up program run time when 
used without loss of synchronization. 


The success of this type of programming depends on host overhead, or in
other words, how dedicated the host is to servicing the needs of the
AP.
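

The idea of overlapping work while still synchronizing before results
are used can be shown with a Python analogy. Nothing here is AP code:
a worker thread plays the role of the AP, the function names are
hypothetical, and waiting on the pending result stands in for a wait
command such as APWR.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def ap_routine(data):
    """Stand-in for a routine running on the AP."""
    time.sleep(0.01)
    return [2 * x for x in data]


def host_work():
    """Stand-in for host work that does not need the AP result."""
    return sum(range(1000))


def run_overlapped(data):
    with ThreadPoolExecutor(max_workers=1) as ap:
        pending = ap.submit(ap_routine, data)  # start the "AP" and return at once
        side_result = host_work()              # host continues in the meantime
        result = pending.result()              # wait before touching the result
    return result, side_result
```

Omitting the wait would correspond to reading the result before it is
present, which is the synchronization hazard described above.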


2.7.5 WRITING AP ASSEMBLY LANGUAGE PROGRAMS 


An AP assembly language routine can be made FORTRAN callable by 
including the following pseudo-operation in the routine: 


$ENTRY name,p


where "name" is the FORTRAN name for the routine, and "p" indicates the
number (maximum of 16) of parameters in the FORTRAN call. At run time,
APEX transfers these parameters to s-pad registers 0 through p-1.
Thus, the pseudo-operation for an AP routine called with the FORTRAN
command CALL VUSER(A,I,B,J,C,K,N) is $ENTRY VUSER,7. At run time, APEX
transfers the seven parameters as shown in Table 2-6.


Table 2-6 Parameter Transfer


S-PAD REGISTER   CONTENTS

0                A
1                I
2                B
3                J
4                C
5                K
6                N
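

The parameter-to-register mapping can be expressed in a few lines of
Python. The function name load_spad is hypothetical; the sketch only
mirrors the rule that parameter k goes to s-pad register k, with at most
16 parameters.

```python
MAX_PARAMETERS = 16  # limit stated for the parameter count p


def load_spad(parameters) -> dict:
    """Map call parameters to s-pad registers 0 through p-1."""
    if len(parameters) > MAX_PARAMETERS:
        raise ValueError("a FORTRAN callable routine takes at most 16 parameters")
    return dict(enumerate(parameters))


# Parameters of CALL VUSER(A,I,B,J,C,K,N), shown by name
registers = load_spad(["A", "I", "B", "J", "C", "K", "N"])
```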


Figure 2-6 illustrates the procedure for creating a FORTRAN callable AP
assembly language routine.


All the AP Math Library routines have been written in AP assembly
language. The user can learn more about this language through the
manuals available from Floating Point Systems, Inc.


[Figure 2-6: flowchart with the following steps]

WRITE AP LANGUAGE SOURCE CODE FOR ROUTINE

ASSEMBLE IT USING APAL (produces AP RELOCATABLE OBJECT CODE)

USE APLINK TO LINK OBJECT CODE WITH OBJECT CODE FOR
ANY AP SUBROUTINES NEEDED BY USER ROUTINE. IF ROUTINE
TO BE USED WITH AP SIMULATOR (APSIM) CONCLUDE APLINK
WITH /E COMMAND; IF TO BE USED ON AP CONCLUDE WITH
/A COMMAND. (produces AP LOAD MODULE (FORTRAN CODE))

SIMULATOR: RUN LOAD MODULE ON SIMULATOR APSIM

AP: COMPILE LOAD MODULE WITH HOST FORTRAN COMPILER
(produces HOST RELOCATABLE OBJECT CODE); OBJECT CODE
AVAILABLE FOR USE BY HOST LINKER (SEE FIGURE 2-4)


Figure 2-6 Procedure for Creating User-Written FORTRAN Callable
AP Assembly Language Routines


CHAPTER 3


DESCRIPTION OF AP MATH LIBRARY ROUTINES 


3.1 INTRODUCTION 


This chapter describes the categories of routines that are contained in 
the AP Math Library. 


3.2 GENERAL INFORMATION ABOUT ROUTINES 


The AP Math Library is divided into 12 categories: 


data transfer and control operations
basic vector arithmetic
vector-to-scalar operations
vector comparison operations
complex vector arithmetic
data formatting operations
matrix operations
FFT operations
auxiliary operations
signal processing operations (optional)
table memory operations (optional)
APAL-callable utility operations


3.2.1 DATA TRANSFER AND CONTROL OPERATIONS 


These commands control the data transfer and program synchronization 
between the AP and the host. They are actually part of APEX, the AP 
executive, and thus require no space in the AP program source memory. 
The execution time for these routines depends on the speed of the host 
system. Refer to sections 2.3.5 through 2.3.8 for more information 
about these calls. 




3.2.2 BASIC VECTOR ARITHMETIC 


This group includes routines to perform basic real vector arithmetic 
operations such as vector add (VADD), subtract (VSUB), multiply (VMUL), 
and divide (VDIV). Also included are trigonometric functions, 
logarithms, simple logical operations, and vector generation (e.g., 
constants, ramps, random numbers). The vectors operated on must 
conform to the real vector format shown in section 2.4.1. 


3.2.3 VECTOR-TO-SCALAR OPERATIONS 


These real vector operations determine global characteristics of a 
vector. They determine a single value that characterizes one facet of 
the vector: sum of all the elements (SVE) or value of the largest 
element (MAXV), etc. 


3.2.4 VECTOR COMPARISON OPERATIONS 


These real vector operations perform compare and replace operations. 
They create a third vector based on the comparison of two vectors. 
VMAX, for example, sets the elements of a third vector equal to the 
larger of each pair of corresponding elements in two vectors. 


3.2.5 COMPLEX VECTOR ARITHMETIC 


All the commands in this group operate on complex vectors or 
combinations of real and complex vectors. The complex vectors must 
conform to the complex vector format described in section 2.4.2. In 
general, the increment parameter used in a complex vector operation 
must always be two or greater when specifying a complex vector. A 
complex element is made up of two parts -- a real part and an imaginary 
part stored in consecutive words in the AP main data memory. The 
parameter N in a complex vector routine always refers to the number of 
complex elements (pairs). 


When the operation involves both real and complex vectors, the TYPICAL 
execution times are obtained when the increments of the real vectors 
are odd and the increments of the complex vectors are even, and the 
base addresses of all the vectors are either all even or all odd. 


Note that some complex vector operations can be done with real vector
routines. CALL VCLR (0, 1, 1000), for example, clears a complex vector
of 500 complex elements that begins at location 0.


3.2.6 DATA FORMATTING OPERATIONS


The 38-bit AP floating-point format is illustrated in Figure 3-1. Bits
0, 1, and 40 are memory parity bits.


[Figure 3-1: word diagram with an EXPONENT field followed by a MANTISSA
field]


Figure 3-1 AP Floating-Point Format


The data formatting operations provide a number of normalizing and
integer-to-floating-point conversion routines to operate on data stored
in the AP main data memory.


VFLT normalizes a vector that has been transferred to the AP with an
APPUT command, using TYPE 1 format-conversion (16-bit integer converted
to unnormalized 38-bit floating-point format). Refer to the discussion
of APPUT beginning with section 2.3.5. Normalization means shifting
the most significant bit of the mantissa to bit 13 of the 38-bit word
with appropriate changes in the exponent.


VFIX converts a normalized 38-bit floating-point word into a 16-bit,
2's complement integer residing in the lower 16 bits (bits 24 to 39) of
the 38-bit word. This operation is used prior to transferring data
with the APGET command when using TYPE 1 format conversion (see the
discussion of APGET beginning with section 2.3.8). VSCALE, VSCSCL and
VSHFX are variations of VFIX.


For convenience, there are also a number of integer unpacking and
conversion operations for use with the TYPE 0 format-conversion in
APPUT and APGET. For example, VUP16 converts two 16-bit integers
packed in the lower 32 bits of a 38-bit word into two 38-bit,
normalized floating-point words. VUP8 converts four 8-bit integers
stored in the lower 32 bits of a 38-bit word into four 38-bit
normalized floating-point words. VPK16 and VPK8 perform the reverse
operation: two 38-bit floating-point words are converted into two
16-bit integers; four floating-point words are converted into four
8-bit integers in a single 38-bit word.
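

The unpacking and packing of 16-bit integers can be sketched in Python.
The function names are hypothetical, and two details are assumptions
that the text does not state: that the integers are signed 2's
complement values, and that the upper half of the 32 bits holds the
first integer.

```python
def unpack16(word32: int) -> list[float]:
    """Split the low 32 bits of a word into two 16-bit integers, as floats."""
    halves = [(word32 >> 16) & 0xFFFF, word32 & 0xFFFF]  # assumed order: high, low
    # Assumed signed: values with the top bit set are negative.
    return [float(h - 0x10000 if h & 0x8000 else h) for h in halves]


def pack16(values) -> int:
    """Inverse of unpack16: two values back into one 32-bit field."""
    high, low = (int(v) & 0xFFFF for v in values)
    return (high << 16) | low
```

Under these assumptions the 32-bit field 0x0001FFFF unpacks to the two
values 1.0 and -1.0, and packing those values restores the field.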


3.2.7 MATRIX OPERATIONS


The matrix routines perform typical matrix operations such as 
multiplication (MMUL), transposition (MTRANS), and inversion (MATINV). 
A matrix is always stored in the AP main data memory as a sequence of 
columns (see section 2.4.4 for a discussion of the AP matrix format). 
The M and N notation used in the matrix operation parameters refers to 
the number of rows (M) and the number of columns (N) in a matrix. For 
example, an operation on a matrix starting at address C involves MC 
rows and NC columns. 


Timing information is given with each matrix routine for representative 
matrix sizes. An address increment of one (i.e., a compactly stored
matrix) is assumed for all the times given. 


3.2.8 FFT OPERATIONS 


The FFT commands perform Fast Fourier Transforms on both real and 
complex vectors. Each FFT routine performs both the forward transform 
(time-to-frequency) and the inverse transform (frequency-to-time), 
depending on the parameter F (+1 for forward, -1 for inverse). There 
are two categories of FFT routines: in-place and not-in-place. The 
in-place routines (RFFT and CFFT) transform the time elements from N 
locations in main data memory and store the resultant complex frequency
elements in the same locations in the main data memory. If the main 
data memory is 8192 words, the user can perform an FFT on N = 8192 real 
points, or N = 4096 complex points. The not-in-place FFT routines 
(RFFTB and CFFTB) run somewhat faster than the in-place routines, but 
require separate locations in main data memory for the time points and 
the resultant frequency points. 


When transforming real time elements into complex frequency elements, a
special method of packing the complex frequency elements is used. An
FFT of N real time points actually produces N/2 + 1 complex frequency
elements. Since a complex element consists of a real and an imaginary
part, N + 2 words are thus required to store N/2 + 1 complex elements.
It is known, however, that the I(0) and the I(N/2) frequency points are
always 0. Therefore, when performing a real-to-complex FFT with the
RFFT or RFFTB commands, the R(N/2) frequency point is stored in the
I(0) memory location, and the I(N/2) frequency point (always 0) is
dropped. The results of an N-point real FFT can thus be stored in N
words. An 8-point real-to-complex FFT, for example, is packed as shown
in Table 3-1.


The RFFTSC routine is provided to allow unpacking of the complex RFFT
vector and scaling of the data. Two types of unpacking are provided.
In Type I (refer to Table 3-1), the I(0) location in memory (which now
holds the R(N/2) data point) is cleared to zero and the R(N/2) value is
discarded. The value of R(N/2) is often considered unimportant since
it represents the frequency component at the Nyquist frequency. Type I
unpacking would be used when performing in-place transforms where all
the available main data locations are being used. In Type II
unpacking, the R(N/2) value is moved from the I(0) location to its
proper R(N/2) location, and the I(0) and I(N/2) memory locations are
cleared to zero. Thus, in Type II unpacking, all the complex data
points are retained. The complex RFFT format is used for both the
in-place and not-in-place real-to-complex transforms. For
complex-to-real inverse FFTs, the complex elements must be repacked
into the complex RFFT format. RFFTSC also handles the repacking
procedure.


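Table 3-1  Real-to-Complex FFT Vector Format


ADDRESS   TIME POINTS   RFFT FORMAT   TYPE I UNPACKING   TYPE II UNPACKING

   0         T(0)          R(0)            R(0)               R(0)
   1         T(1)          R(4)            0                  0
   2         T(2)          R(1)            R(1)               R(1)
   3         T(3)          I(1)            I(1)               I(1)
   4         T(4)          R(2)            R(2)               R(2)
   5         T(5)          I(2)            I(2)               I(2)
   6         T(6)          R(3)            R(3)               R(3)
   7         T(7)          I(3)            I(3)               I(3)
   8                                                          R(4)
   9                                                          0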
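

The packing and the two unpacking types can be illustrated with a short Python sketch. It uses a textbook DFT rather than the AP routines, so the absolute values may differ from RFFT output by a constant scale factor; the function names are hypothetical and only the word layout follows the description above.

```python
# Illustrative sketch of the packed real-FFT layout; not the AP implementation.
import cmath

def dft(x):
    """Textbook forward DFT (unscaled)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def pack_rfft(x):
    """N words: R(0), R(N/2), R(1), I(1), ..., R(N/2-1), I(N/2-1)."""
    n = len(x)
    spec = dft(x)
    packed = [spec[0].real, spec[n // 2].real]   # R(N/2) stored in the I(0) slot
    for k in range(1, n // 2):
        packed += [spec[k].real, spec[k].imag]
    return packed

def unpack_type1(packed):
    """Type I: clear the I(0) slot; R(N/2) is discarded, length stays N."""
    out = list(packed)
    out[1] = 0.0
    return out

def unpack_type2(packed):
    """Type II: move R(N/2) to its own slot, I(0) = I(N/2) = 0, length N + 2."""
    out = list(packed) + [packed[1], 0.0]
    out[1] = 0.0
    return out

x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
packed = pack_rfft(x)
```

For this 8-point input, Type I keeps 8 words while Type II needs 10, which is why Type I suits in-place transforms that fill all of main data memory.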


The complex-to-complex FFT routines, CFFT and CFFTB, require no such
packing or unpacking since all operations are performed on properly
formatted complex vectors. An FFT of N complex time elements produces N
complex frequency elements.


The data obtained either from a forward RFFT or a forward CFFT requires 
rescaling. Table 3-2 shows the multiplying factors for each transform 
to get back to the original scale. The RFFTSC and CFFTSC routines, 
respectively, provide parameters for scaling the results. Note that no 
scaling is required for the inverse transforms. 


Table 3-2  Multiplying Factors for Scaling FFT Results


ROUTINES        FORWARD     INVERSE

CFFT, CFFTB     1/N         1
RFFT, RFFTB     1/(2N)      1


3.2.9 AUXILIARY OPERATIONS


The commands in the auxiliary operations group perform miscellaneous 
operations such as numerical integration, evaluation of polynomials, 
and convolutions. 


3.2.10 SIGNAL PROCESSING OPERATIONS (OPTIONAL) 


The signal processing operations consist of a collection of real and 
complex routines which are often used in conjunction with the FFT 
routines. They perform many widely used time series analysis 
calculations such as auto-spectrum (ASPEC), cross-spectrum (CSPEC), 
coherence function (COHER) and histogram (HIST). 


3.2.11 TABLE MEMORY OPERATIONS (OPTIONAL)


These subroutines are for use with the optional writable table memory 
(TMRAM). This table memory is used in conjunction with the standard 
read only table memory in the AP. 


The TMRAM allows the user to either create a table of his own special 
purpose constants, or use the additional memory space as an adjunct to 
the main data memory. The TMRAM has a 167ns memory access time. When 
used in conjunction with the main data memory, it can speed up basic 
vector arithmetic operations such as add, subtract, multiply and move. 
For example, MTTADD adds one vector from the main data memory to a 
vector from the TMRAM, and stores the resultant vector in the TMRAM. 
Since the AP can access a word in the main data memory and a word in 
the TMRAM in the same machine cycle, one machine cycle is saved in the 
calculation. 


The nomenclature used in these calls refers to the main data memory (M)
and the TMRAM (T). In the MMTADD(A, I, B, J, C, K, N) command, for
example, A and B are base addresses in main data memory (M), and C is
a base address in the TMRAM (T). The addresses in the table memory are
numbered consecutively from 0 as in the main data memory.


3.2.12 APAL-CALLABLE UTILITY OPERATIONS 


These routines are called by many of the FORTRAN callable routines in 
the AP Math Library (refer to the category EXTERNALS below the dotted 
line in the routine descriptions). They are callable from programs 
written in APAL, but are not callable from FORTRAN programs. This 
miscellaneous assortment of routines includes scalar functions such as 
sine, cosine and square root, several routines called in the FFT 
operations, and double-precision scalar functions. All pertinent 
information is given for each routine except for the FORTRAN CALL, 
PARAMETERS and EXAMPLE. The execution times given are generally the
total execution time, since most of the routines are non-repetitive.


CHAPTER 4


PROGRAMMING EXAMPLES 


4.1 INTRODUCTION 


This chapter contains four examples illustrating the use of the AP Math
Library routines in FORTRAN programs. The first two examples show the
replacement of FORTRAN arithmetic DO loops with FORTRAN code that
performs equivalent processing in the AP. The last two examples
illustrate programs that perform double-tapered convolution operations
in the AP -- one using time-domain techniques, the other using
frequency-domain techniques.


4.1.1 EXAMPLE 1: A BENCHMARK PROGRAM, INCLUDING AP MEMORY MAP


This section lists a benchmark program which includes an AP memory map. 


C****** EXAMPLE 1 = BENCHMARK PROGRAM ******
C
C======================================================================
C
C     ORIGINAL FORTRAN
C
C     SUBROUTINE EX1(SCLR,AM,V,VX,VY,X,X0,Y,Y0,Z,N)
C     DIMENSION AM(N),V(N),VX(N),VY(N),X(N),X0(N),Y(N),Y0(N),Z(N)
C     DO 1 I=1,N
C     VX(I)=SCLR * (X(I)-X0(I))
C     VY(I)=SCLR * (Y(I)-Y0(I))
C     V(I)=SQRT(VX(I)**2 + VY(I)**2)
C   1 Z(I)=AM(I) * (X(I)*VY(I) - Y(I)*VX(I))
C     RETURN
C     END
C
C======================================================================
C
      SUBROUTINE EX1(SCLR,AM,V,VX,VY,X,X0,Y,Y0,Z,N)
      DIMENSION AM(N),V(N),VX(N),VY(N),X(N),X0(N),Y(N),Y0(N),Z(N)
C----ALLOCATE AP MEMORY (SEE MEMORY MAP AT END OF PROGRAM)
      ISCLR=0
      IAM=ISCLR+1
      IV=IAM+N
      IVX=IV+N
      IVY=IVX+N
      IX=IVY+N
      IX0=IX+N
      IY=IX0+N
      IY0=IY+N
      IZ=IY0+N
C----INITIALIZE AP
      CALL APCLR
C----PUT OUT DATA TO AP
      CALL APPUT(AM,IAM,N,2)
      CALL APPUT(X,IX,N,2)
      CALL APPUT(X0,IX0,N,2)
      CALL APPUT(Y,IY,N,2)
      CALL APPUT(Y0,IY0,N,2)
      CALL APPUT(SCLR,ISCLR,1,2)
      CALL APWD
C----DO THE COMPUTATION
C
C     AP COMPUTATION TIME FOR N=1000 IS 8.2 MS FOR 167 NS MEMORY,
C     11.9 MS FOR 333 NS MEMORY, EXCLUSIVE OF HOST SYSTEM OVERHEAD
C
      CALL VSUB(IX,1,IX0,1,IVX,1,N)
      CALL VSUB(IY,1,IY0,1,IVY,1,N)
      CALL VSMUL(IVX,1,ISCLR,IVX,1,N)
      CALL VSMUL(IVY,1,ISCLR,IVY,1,N)
      CALL VMMA(IVX,1,IVX,1,IVY,1,IVY,1,IV,1,N)
      CALL VSQRT(IV,1,IV,1,N)
      CALL VMMSB(IX,1,IVY,1,IY,1,IVX,1,IZ,1,N)
      CALL VMUL(IAM,1,IZ,1,IZ,1,N)
      CALL APWR
C----GET RESULTS FROM AP
      CALL APGET(VX,IVX,N,2)
      CALL APGET(VY,IVY,N,2)
      CALL APGET(V,IV,N,2)
      CALL APGET(Z,IZ,N,2)
      CALL APWD
      RETURN
      END
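

To see why the chain of vector calls is equivalent to the original DO loop, the following Python sketch runs both forms on a flat list that stands in for AP main data memory. It is a host-side illustration under stated assumptions: the operand order of each pass (for example, first operand minus second for the subtract) is inferred from the original loop, unit address increments are assumed, and none of the function names belong to the AP library.

```python
# Illustrative sketch: scalar loop versus whole-vector passes over a flat memory.
import math

def ex1_loop(sclr, am, x, x0, y, y0):
    """Original scalar DO loop: one element per iteration."""
    vx, vy, v, z = [], [], [], []
    for i in range(len(x)):
        vx.append(sclr * (x[i] - x0[i]))
        vy.append(sclr * (y[i] - y0[i]))
        v.append(math.sqrt(vx[i] ** 2 + vy[i] ** 2))
        z.append(am[i] * (x[i] * vy[i] - y[i] * vx[i]))
    return vx, vy, v, z

def ex1_vector(sclr, am, x, x0, y, y0):
    """Same computation as a sequence of whole-vector passes."""
    n = len(x)
    # Memory map, mirroring the allocation in the listing.
    iscl = 0
    iam = iscl + 1
    iv = iam + n
    ivx = iv + n
    ivy = ivx + n
    ix = ivy + n
    ix0 = ix + n
    iy = ix0 + n
    iy0 = iy + n
    iz = iy0 + n
    md = [0.0] * (iz + n)
    md[iscl] = sclr
    for base, data in ((iam, am), (ix, x), (ix0, x0), (iy, y), (iy0, y0)):
        md[base:base + n] = data                      # stands in for APPUT
    for k in range(n):                                # VX = X - X0
        md[ivx + k] = md[ix + k] - md[ix0 + k]
    for k in range(n):                                # VY = Y - Y0
        md[ivy + k] = md[iy + k] - md[iy0 + k]
    for k in range(n):                                # VX = VX * SCLR
        md[ivx + k] = md[ivx + k] * md[iscl]
    for k in range(n):                                # VY = VY * SCLR
        md[ivy + k] = md[ivy + k] * md[iscl]
    for k in range(n):                                # V = VX*VX + VY*VY
        md[iv + k] = md[ivx + k] * md[ivx + k] + md[ivy + k] * md[ivy + k]
    for k in range(n):                                # V = SQRT(V)
        md[iv + k] = math.sqrt(md[iv + k])
    for k in range(n):                                # Z = X*VY - Y*VX
        md[iz + k] = md[ix + k] * md[ivy + k] - md[iy + k] * md[ivx + k]
    for k in range(n):                                # Z = AM * Z
        md[iz + k] = md[iam + k] * md[iz + k]
    return (md[ivx:ivx + n], md[ivy:ivy + n], md[iv:iv + n], md[iz:iz + n])
```

Each pass touches a whole vector before the next one starts, which is the restructuring the AP version of the subroutine performs.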


4.1.2 EXAMPLE 2: A GEOPOTENTIAL CALCULATION PROGRAM


This section lists a geopotential calculation program.


C****** EXAMPLE 2 = GEOPOTENTIAL CALCULATION ******
C
C     ORIGINAL FORTRAN
C
C     SUBROUTINE EX2
C     COMMON /B/PHIB(100,10),HB(100,10),PKB(100),DS12
C     DO 1 J=1,9
C     DO 1 I=1,100
C   1 PHIB(I,J+1)=PHIB(I,J)+DS12*PKB(I)*(HB(I,J+1)+HB(I,J))
C     RETURN
C
      SUBROUTINE EX2
      COMMON /B/PHIB(100,10),HB(100,10),PKB(100),DS12
C-----AP MEMORY LAYOUT
      IDS12=0
      IPKB=1
      IHB=IPKB+100
      IPHIB=IHB+1000
C-----INITIALIZE THE AP
      CALL APCLR
C-----PUT OUT THE DATA TO AP
      CALL APPUT(PHIB,IPHIB,1000,2)
      CALL APPUT(HB,IHB,1000,2)
      CALL APPUT(PKB,IPKB,100,2)
      CALL APPUT(DS12,IDS12,1,2)
      CALL APWD
C
C-----DO THE COMPUTATION
C
C     AP COMPUTATION TIME IS 2.3 MS FOR 167 NS MEMORY, 3.7 MS FOR
C     333 NS MEMORY, EXCLUSIVE OF HOST SYSTEM OVERHEAD
C
      CALL VSMUL(IPKB,1,IDS12,IPKB,1,100)
      CALL VADD(IHB+100,1,IHB,1,IHB,1,900)
      JHB=IHB
      DO 1 J=1,9
      CALL VMUL(IPKB,1,JHB,1,JHB,1,100)
    1 JHB=JHB+100
      CALL VADD(IPHIB,1,IHB,1,IPHIB+100,1,900)
      CALL APWR
C-----GET THE RESULTS FROM AP
      CALL APGET(PHIB(1,2),IPHIB+100,900,2)
      CALL APWD
      RETURN
      END
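

The final VADD in this example writes its result 100 words ahead of where it reads, so each newly written PHIB column feeds the next one. The Python sketch below checks that this reproduces the recurrence over J, assuming the vector add processes elements one at a time in ascending address order (as the MATHSIM VADD listing in section 5.4.1 does). The function names and the list-of-columns data layout are assumptions for the sketch.

```python
# Illustrative sketch: nested loops versus overlapping whole-vector passes.

def geo_loop(phib, hb, pkb, ds12):
    """Original nested DO loops; phib and hb are lists of columns."""
    phib = [col[:] for col in phib]
    for j in range(len(hb) - 1):
        for i in range(len(pkb)):
            phib[j + 1][i] = phib[j][i] + ds12 * pkb[i] * (hb[j + 1][i] + hb[j][i])
    return phib

def geo_vector(phib, hb, pkb, ds12):
    """Same result from vector passes over a flat, column-ordered memory."""
    m = len(pkb)                 # rows per column (100 in the listing)
    ncol = len(hb)               # number of columns (10 in the listing)
    ids12 = 0
    ipkb = 1
    ihb = ipkb + m
    iphib = ihb + m * ncol
    md = [0.0] * (iphib + m * ncol)
    md[ids12] = ds12
    md[ipkb:ipkb + m] = pkb
    for j in range(ncol):
        md[ihb + j * m:ihb + (j + 1) * m] = hb[j]
        md[iphib + j * m:iphib + (j + 1) * m] = phib[j]
    count = m * (ncol - 1)
    for k in range(m):                       # PKB = PKB * DS12
        md[ipkb + k] = md[ipkb + k] * md[ids12]
    for k in range(count):                   # HB(.,J) = HB(.,J+1) + HB(.,J)
        md[ihb + k] = md[ihb + m + k] + md[ihb + k]
    for j in range(ncol - 1):                # HB(.,J) = PKB * HB(.,J), per column
        for k in range(m):
            md[ihb + j * m + k] = md[ipkb + k] * md[ihb + j * m + k]
    for k in range(count):                   # overlapping add: output is m words ahead
        md[iphib + m + k] = md[iphib + k] + md[ihb + k]
    return [md[iphib + j * m:iphib + (j + 1) * m] for j in range(ncol)]
```

Because element k + m is written before the pass reaches it as an input, the single overlapping add carries the running sum from column to column.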


4.1.3 EXAMPLE 3: A TIME-DOMAIN CONVOLUTION PROGRAM


This section lists a time-domain convolution program.


C****** EXAMPLE 3 = TIME DOMAIN CONVOLUTION ******
C
      SUBROUTINE TCONV(TRACE,FILTER,RESULT,NTRACE,NFILT,NRESLT)
      INTEGER NTRACE,NFILT,NRESLT
      REAL TRACE(NTRACE),FILTER(NFILT),RESULT(NRESLT)
C
C     DOES A TIME DOMAIN CONVOLUTION OF 'TRACE' WITH 'FILTER',
C     PRODUCING 'RESULT'.
C
C     A DOUBLE TAPERED CONVOLUTION IS DONE BY PADDING THE SUPPLIED
C     TRACE WITH BOTH LEADING AND TRAILING ZEROS IN THE AP-120B.
C
C-------PARAMETERS:
C
C     TRACE  - INPUT DATA TRACE
C     FILTER - INPUT FILTER
C     RESULT - OUTPUT RESULT
C     NTRACE - NUMBER OF TRACE POINTS
C     NFILT  - NUMBER OF FILTER POINTS
C     NRESLT - NUMBER OF RESULT POINTS, MUST EQUAL
C              NTRACE+NFILT-1
C
C     NOTE: THE RESULT MAY BE STORED IN THE HOST ON TOP OF EITHER THE
C           DATA OR THE FILTER
C
C-------ROUTINES USED: APPUT,APGET,APCLR,APWR,APWD,VCLR,CONV
C
C     LOCAL STORAGE
      INTEGER IFMT,NPAD,ITRACE,IFILT
C
C-------METHOD:
C
C     FOR EXAMPLE, NTRACE=5, NFILT=3:
C     THEN:
C          NRESLT=7
C          ITRACE=0
C          IRESLT=0
C          IFILT=9
C          NPAD=2
C
C     AP MEMORY LAYOUT:
C
C     LOC
C      0    0               <--- ITRACE   <--- IRESLT
C      1    0
C      2    TRACE PT #1
C      3      "    "  2
C      4      "    "  3
C      5      "    "  4
C      6      "    "  5
C      7    0
C      8    0
C      9    FILTER PT #1    <--- IFILT
C     10      "     "  2
C     11      "     "  3
C
C
      IFMT=2                                    /*FORMAT 2 FOR FLOATING POINT
C-----INITIALIZE AP
      CALL APCLR
C-----ALLOCATE AP MEMORY
      NPAD=NFILT-1                              /*NUMBER OF ZERO PADS
      ITRACE=0                                  /*TRACE LOCATION IN THE AP
      IFILT=ITRACE+NTRACE+NPAD*2                /*FILTER LOCATION IN THE AP
C-----TRANSFER DATA TO AP
      CALL APPUT(TRACE,ITRACE+NPAD,NTRACE,IFMT) /*PUT THE TRACE
      CALL APPUT(FILTER,IFILT,NFILT,IFMT)       /*PUT THE FILTER
      CALL APWD
C-----DO THE COMPUTATION
      CALL VCLR(ITRACE,1,NPAD)                  /*FRONT ZERO PAD
      CALL VCLR(ITRACE+NRESLT,1,NPAD)           /*BACK ZERO PAD
C-----DO IT
      CALL CONV(ITRACE,1,IFILT+NFILT-1,-1,ITRACE,1,NRESLT,NFILT)
      CALL APWR
C-----TRANSFER RESULTS FROM AP
      CALL APGET(RESULT,ITRACE,NRESLT,IFMT)     /*GET RESULTS
      CALL APWD
      RETURN
      END
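

The padding scheme in the METHOD comments can be checked with a short Python sketch. It assumes that CONV forms a sliding sum of products of the form C(m) = SUM over q of A(m+q)*B(q), so that starting at the last filter point with an increment of -1 turns the correlation into a convolution; that reading of CONV, and the function names, are assumptions rather than statements from the listing.

```python
# Illustrative sketch of double-tapered (zero-padded) time-domain convolution.

def conv_padded(trace, filt):
    """Pad the trace front and back, then slide the reversed filter along it."""
    ntrace, nfilt = len(trace), len(filt)
    npad = nfilt - 1                         # number of zero pads on each side
    nreslt = ntrace + nfilt - 1
    padded = [0.0] * npad + list(trace) + [0.0] * npad
    result = []
    for m in range(nreslt):
        # The filter is read backwards, starting from its last point.
        result.append(sum(padded[m + q] * filt[nfilt - 1 - q] for q in range(nfilt)))
    return result

def conv_direct(trace, filt):
    """Reference: textbook full convolution."""
    out = [0.0] * (len(trace) + len(filt) - 1)
    for i, t in enumerate(trace):
        for j, f in enumerate(filt):
            out[i + j] += t * f
    return out

result = conv_padded([1.0, 2.0, 3.0, 4.0, 5.0], [1.0, 0.0, -1.0])
```

With NTRACE=5 and NFILT=3 the padded trace occupies 9 words and the result has 7 points, matching the memory layout in the listing. Unlike the AP call, the sketch writes its output to a separate list rather than over the trace.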


4.1.4 EXAMPLE 4: A FREQUENCY-DOMAIN CONVOLUTION PROGRAM


This section lists a frequency-domain convolution program.


C****** EXAMPLE 4 = FREQUENCY DOMAIN CONVOLUTION ******
C
      SUBROUTINE FCONV(TRACE,FILTER,RESULT,NTRACE,NFILT,NRESLT)
      INTEGER NTRACE,NFILT,NRESLT
      REAL TRACE(NTRACE),FILTER(NFILT),RESULT(NRESLT)
C
C     DOES A FREQUENCY DOMAIN CONVOLUTION OF 'TRACE' WITH 'FILTER',
C     PRODUCING 'RESULT'. A DOUBLE TAPERED CONVOLUTION IS DONE.
C
C-------PARAMETERS:
C
C     TRACE  - INPUT DATA TRACE
C     FILTER - INPUT FILTER
C     RESULT - OUTPUT RESULT
C     NTRACE - NUMBER OF TRACE POINTS (MUST BE A POWER OF 2)
C     NFILT  - NUMBER OF FILTER POINTS
C              (MUST BE A POWER OF 2 <= NTRACE)
C     NRESLT - NUMBER OF RESULT POINTS (MUST EQUAL NTRACE)
C
C     NOTE: THE RESULT MAY BE STORED IN THE HOST ON TOP OF EITHER THE
C           DATA OR THE FILTER
C
C-------ROUTINES USED: APPUT,APGET,APCLR,APWR,APWD,VCLR,RFFT,VMUL,
C                      CVMUL,RFFTSC
C
C     LOCAL STORAGE
      INTEGER IFMT,NFFT,ITRACE,IFILT
C
      IFMT=2                                    /*FORMAT 2 FOR FLOATING POINT
C-----INITIALIZE AP
      CALL APCLR
C-----ALLOCATE AP MEMORY
      NFFT=NTRACE*2                             /*FFT SIZE
      ITRACE=0                                  /*LOCATION OF TRACE IN AP
      IFILT=ITRACE+NFFT                         /*LOCATION OF FILTER IN AP
C-----TRANSFER DATA TO AP
      CALL APPUT(TRACE,ITRACE,NTRACE,IFMT)      /*PUT TRACE
      CALL APPUT(FILTER,IFILT,NFILT,IFMT)       /*PUT FILTER
      CALL APWD
C-----DO THE COMPUTATION
      CALL VCLR(ITRACE+NTRACE,1,NFFT-NTRACE)    /*PAD TRACE
      CALL VCLR(IFILT+NFILT,1,NFFT-NFILT)       /*PAD FILTER
      CALL RFFT(ITRACE,NFFT,1)                  /*FORWARD FFT TRACE
      CALL RFFT(IFILT,NFFT,1)                   /*FORWARD FFT FILTER
      CALL VMUL(ITRACE,1,IFILT,1,ITRACE,1,2)    /*CROSS MUL 1ST 2
C-----DO REST
      CALL CVMUL(ITRACE+2,2,IFILT+2,2,ITRACE+2,2,NTRACE-1,1)
      CALL RFFTSC(ITRACE,NFFT,0,-1)             /*SCALE RESULTS BY 1/(4*NFFT)
      CALL RFFT(ITRACE,NFFT,-1)                 /*INVERSE FFT
      CALL APWR
C-----TRANSFER RESULTS FROM AP
      CALL APGET(RESULT,ITRACE,NFFT,IFMT)       /*GET RESULTS
      CALL APWD
      RETURN
      END
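

The frequency-domain method rests on padding both inputs to the FFT size, multiplying the spectra point by point, and transforming back. The Python sketch below shows that idea with a textbook DFT pair; it is not the AP code path, the function names are hypothetical, and the 1/NFFT factor applies to this textbook transform pair rather than to the packed RFFT format used in the listing.

```python
# Illustrative sketch of convolution by spectrum multiplication.
import cmath

def dft(x, sign):
    """Textbook DFT; sign = -1 for forward, +1 for inverse (both unscaled)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def fconv(trace, filt):
    """Zero-pad to twice the trace length, multiply spectra, transform back."""
    ntrace = len(trace)
    nfft = 2 * ntrace
    a = list(trace) + [0.0] * (nfft - ntrace)      # pad trace
    b = list(filt) + [0.0] * (nfft - len(filt))    # pad filter
    fa = dft(a, -1)
    fb = dft(b, -1)
    product = [p * q for p, q in zip(fa, fb)]      # point-by-point complex multiply
    return [value.real / nfft for value in dft(product, +1)]

result = fconv([1.0, 2.0, 3.0, 4.0], [1.0, -1.0])
```

Padding to twice the trace length keeps the circular convolution implied by the transform from wrapping around, so the leading part of the output equals the ordinary convolution. In the packed real format, the first two words hold the purely real R(0) and R(N/2) terms, which is why the listing multiplies them with VMUL and the remaining pairs with CVMUL.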


CHAPTER 5


FORTRAN MATH LIBRARY SIMULATOR (MATHSIM)


5.1 INTRODUCTION


The FORTRAN Math Library Simulator (MATHSIM) consists of a series of
FORTRAN subroutines. These subroutines simulate the AP Math Library
routines and APEX routines which control data flow to and from the AP
(DMA), process data in the AP, and synchronize host/AP operations.
MATHSIM allows the user to check FORTRAN programs which make numerous
calls to the Math Library and APEX without use of the AP. It estimates
various host/AP program execution times such as data flow time, AP
program execution time, and host overhead time. MATHSIM allows
detection of possible host/AP synchronization errors. Adjustment of a
few program parameters enables MATHSIM to closely simulate all of the
many host/AP systems, thus allowing the user to predict the effects of
possible system modifications upon execution.


MATHSIM provides the basic features outlined in the following sections.


5.1.1 MATH LIBRARY ROUTINES


MATHSIM provides an equivalent FORTRAN routine for each Math Library 
routine simulated. Each routine has a calling sequence identical to 
that used by the actual Math Library routine. No changes in the user’s 
FORTRAN program are necessary. The user simply compiles the calling 
program, links to the simulated Math Library, and runs the program as 
if the AP were present. 


5.1.2 DMA


MATHSIM simulates the flow of data in and out of the AP by calls to the
APEX routines APPUT and APGET. This is done just as if the AP were
present. The simulator also simulates program source loading and
management via the subroutine APEX.


5.1.3 HOST/AP SYNCHRONIZATION


Since both loading and executing the AP are simulated, MATHSIM also 
simulates the synchronization by calls to the APEX subroutines APWD, 
APWR, and APWAIT. When these calls are omitted, a synchronization 
warning is tallied, but execution continues. 


5.1.4 TIMING ESTIMATES


MATHSIM estimates three of the system times: AP program execution
time, DMA time, and AP executive (host overhead) time. The simulator
does not account for any overlapping of these functions.


5.2 DETAILED DESCRIPTION


This section presents a detailed description of the various features of
MATHSIM.


5.2.1 MATHSIM ROUTINES 


MATHSIM contains most of the FORTRAN callable routines in the basic AP
Math Library and the Signal Processing Library. The APEX routines
described in this manual are included in MATHSIM. The routines
supported by MATHSIM are listed in Tables 5-1 and 5-2.


Table 5-1  MATHSIM APEX Routines


NAME      OPERATION

APASGN    Assign AP
APCHK     Check AP program error condition
APCLR     Initialize the AP
APEX      Program source executive
APGET     Get data from the AP
APGSP     Read an AP s-pad register
APINIT    Assign an AP and initialize APEX
APPUT     Put data into the AP
APRLSE    Release AP
APSTAT    Get AP hardware status
APSTOP    Pause on AP fatal error
APWAIT    Wait for AP
APWD      Wait for DMA and error check
APWR      Wait for AP run complete and error check
APXCLR    Clear APEX tables
APXSET    Initialize APEX and reset AP
ILOC      Find address of variable

Table 5-2  MATHSIM Math Library Routines


NAME      OPERATION

ACORF     Auto-correlation (frequency-domain)
ACORT     Auto-correlation (time-domain)
ASPEC     Accumulating auto-spectrum
CCORF     Cross-correlation (frequency-domain)
CCORT     Cross-correlation (time-domain)
CDOTPR    Complex vector dot product
CFFT      Complex to complex FFT (in place)
CFFTB     Complex to complex FFT (not in place)
CFFTSC    Complex FFT scale
COHER     Coherence function
CONV      Convolution (correlation)
CRVADD    Complex and real vector add
CRVDIV    Complex and real vector divide
CRVMUL    Complex and real vector multiply
CRVSUB    Complex and real vector subtract
CSPEC     Accumulating cross-spectrum
CTRN3     3-Dimension coordinate transformation
CVADD     Complex vector add
CVCOMB    Complex vector combine
CVCONJ    Complex vector conjugate
CVEXP     Complex exponential
CVFILL    Complex vector fill
CVMA      Complex vector multiply and add
CVMAGS    Complex vector magnitude squared
CVMEXP    Vector multiply complex exponential
CVMOV     Complex vector move
CVMUL     Complex vector multiply
CVNEG     Complex vector negate
CVRCIP    Complex vector reciprocal
CVREAL    Form complex vector of reals
CVSMUL    Complex vector scalar multiply
CVSUB     Complex vector subtract
DEQ22     Difference equation, 2 poles, 2 zeros
DOTPR     Dot product
FMMM      Fast memory matrix multiply
FMMM32    Fast memory matrix mult (dim 32 or less)
HANN      Hanning window multiply
HIST      Histogram
LVEQ      Logical vector equal
LVGE      Logical vector greater or equal
LVGT      Logical vector greater than
LVNE      Logical vector not equal
LVNOT     Logical vector not
MATINV    Matrix inverse
MAXMGV    Maximum magnitude element in vector
MAXV      Maximum element in vector
MEAMGV    Mean of vector element magnitudes
MEANV     Mean value of vector elements
MEASQV    Mean of vector element squares
MINMGV    Minimum magnitude element in vector
MINV      Minimum element in vector
MMUL      Matrix multiply
MMUL32    Matrix multiply (dim 32 or less)
MTHSIM    FORTRAN simulation of APMATH
MTRANS    Matrix transpose
MVML3     Matrix vector multiply (3x3)
MVML4     Matrix vector multiply (4x4)
POLAR     Rectangular to polar conversion
RECT      Polar to rectangular conversion
RFFT      Real to complex FFT (in place)
RFFTB     Real to complex FFT (not in place)
RFFTSC    Real FFT scale and format
RMSQV     Root-mean-square of vector elements
SCJMA     Self-conjugate multiply and add
SOLVEQ    Linear equation solver
SVE       Sum of vector elements
SVEMG     Sum of vector element magnitudes
SVESQ     Sum of vector element squares
SVS       Sum of vector signed squares
TCONV     Posttapered convolution (correlation)
TRANS     Transfer function
VAAM      Vector add, add, and multiply
VABS      Vector absolute value
VADD      Vector add
VALOG     Vector antilogarithm (base 10)
VAM       Vector add and multiply
VATAN     Vector arctangent
VATN2     Vector arctangent of y/x
VAVEXP    Vector exponential averaging
VAVLIN    Vector linear averaging
VCLIP     Vector clip
VCLR      Vector clear
VCOS      Vector cosine
VDBPWR    Vector conversion to DB (power)
VDIV      Vector divide
VEXP      Vector exponential
VFILL     Vector fill
VFIX      Vector integer fix
VFLT      Vector integer float
VFRAC     Vector truncate to fraction
VICLIP    Vector inverted clip
VIMAG     Extract imaginaries of complex vector
VINDEX    Vector index
VINT      Vector truncate to integer
VLIM      Vector limit
VLMERG    Vector logical merge
VLN       Vector natural logarithm
VLOG      Vector logarithm (base 10)
VMA       Vector multiply and add
VMAX      Vector maximum
VMAXMG    Vector maximum magnitude
VMIN      Vector minimum
VMINMG    Vector minimum magnitude
VMMA      Vector multiply, multiply, and add
VMMSB     Vector multiply, multiply, and subtract
VMOV      Vector move
VMSA      Vector multiply and scalar add
VMSB      Vector multiply and subtract
VMUL      Vector multiply
VNEG      Vector negate
VPOLY     Vector polynomial
VRAMP     Vector ramp
VRAND     Vector random numbers
VREAL     Vector reals of complex vector
VSADD     Vector scalar add
VSBM      Vector subtract and multiply
VSBSBM    Vector subtract, subtract, and multiply
VSCALE    Vector scale (power 2) and fix
VSCSCL    Vector scan, scale (power 2) and fix
VSHFX     Vector shift and fix
VSIMPS    Vector Simpsons 1/3 rule integration
VSIN      Vector sine
VSMA      Vector scalar multiply and add
VSMSA     Vector scalar multiply and scalar add
VSMSB     Vector scalar multiply and subtract
VSMUL     Vector scalar multiply
VSQ       Vector square
VSQRT     Vector square root
VSSQ      Vector signed square
VSUB      Vector subtract
VSUM      Vector sum of elements integration
VSWAP     Vector swap
VTRAPZ    Vector trapezoidal rule integration
WIENER    Wiener Levinson algorithm
ZMD       Clear all main data memory


MATHSIM does not include routines which relate to byte packing and
unpacking, vector logical operations, and support of table memory. The
routines in the basic AP Math Library and the Signal Processing Library
which are not included in MATHSIM are:


     VAND
     VEQV
     VOR
     VISMUL
     VUP8
     VUPS8
     VPK8
     VUP16
     VUPS16
     VPK16
     VFLT32
     VFIX32


MATHSIM also does not support APAL-callable utility routines, such as
DIV and SAVESP.


The following libraries are not supported by MATHSIM:

     TMRAM library
     Page select/parity library
     IOP library
     PIOP library


5.2.2 DMA


AP main data (MD) is simulated by the array variable APMD. APMD is 
communicated to the MATHSIM routines by the COMMON block: 


COMMON /COMMD/ APMD (1024) 


APPUT transfers the data taken from an array defined in the user 
program into APMD. APGET transfers data taken from APMD into an array 
defined in the user program. Execution of the MATHSIM routines 
requires a number of pointers in the APMD scratch space. These 
pointers are based on the subroutine call parameters. 


MATHSIM supports APPUT and APGET format types 1 (16-bit integer) and 2
(host floating-point).


Via APEX, MATHSIM handles program source management exactly as it is
handled during actual AP use. Thus, it is possible to encounter
program source overflow, which causes the run to halt. Note that the
PS size is set in the subroutine APXSET (PSSIZ = 1024). This size can
be changed by the user.


5.2.3 SYNCHRONIZATION


MATHSIM simulates synchronization by calls to the APEX subroutines 
APWD, APWR and APWAIT. The variables involved in sensing a possible 
synchronization error are communicated to the necessary subroutines via 
the following: 


INTEGER ERRFLG 
LOGICAL DMAFLG, RUNFLG, INIFLG 
COMMON /FLAGS/ DMAFLG,RUNFLG, INIFLG, ERRFLG 


These statements appear in the following five subroutines: 


APPUT 
APGET 
APEX 
APTIME 
APCLR 


All flags are initialized by a call to APCLR. INIFLG is set to .FALSE. 
at compile time (in APCLR). Whenever a call to subroutine APEX is made 
before a call to APCLR or APINIT, the run halts and an error message is 
issued. 


Whenever the user omits an APWD or APWR in the program, a
synchronization warning is tallied (ERRFLG = ERRFLG + 1) and the run
continues. Omission of an APWAIT for either case causes the same
results. It is important to note that some errors of this type may
escape detection. Also, the detection of possible errors does not
necessarily mean that the program is invalid, as in cases where
omission of calls to the waiting routines actually produces a more
efficient program.


MATHSIM checks synchronization in the following ways:


•  When a DMA is initiated, the DMAFLG is set; MATHSIM
   checks the RUNFLG; when the RUNFLG is set, MATHSIM
   tallies a synchronization warning.


•  When an AP subroutine is executed, the RUNFLG is set;
   MATHSIM checks the DMAFLG; when the DMAFLG is set,
   MATHSIM tallies a synchronization warning.


•  When a call is made to APWD, the DMAFLG is turned off;
   when a call is made to APWR, the RUNFLG is turned off;
   when a call is made to APWAIT, both the DMAFLG and the
   RUNFLG are turned off.


•  MATHSIM considers a call to APTIME as host execution.
   When a call is issued to APTIME, the simulator checks
   the DMAFLG and the RUNFLG. If either of these is
   set, MATHSIM tallies a synchronization warning.


•  When a host program is executing with either the
   DMAFLG or the RUNFLG set, but no call is made to APTIME,
   MATHSIM does not detect the possible error.
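

The flag rules listed above amount to a small state machine, sketched here in Python. The class and method names are hypothetical and the sketch covers only the rules stated in the list; it is not the MATHSIM source.

```python
# Illustrative sketch of the synchronization checks; names are assumptions.

class SyncChecker:
    def __init__(self):
        self.dma_flag = False    # stands in for DMAFLG
        self.run_flag = False    # stands in for RUNFLG
        self.warnings = 0        # stands in for the ERRFLG tally

    def start_dma(self):
        """APPUT or APGET: warn if an AP routine is still marked as running."""
        if self.run_flag:
            self.warnings += 1
        self.dma_flag = True

    def start_run(self):
        """Any AP subroutine: warn if a DMA is still marked as active."""
        if self.dma_flag:
            self.warnings += 1
        self.run_flag = True

    def apwd(self):
        self.dma_flag = False

    def apwr(self):
        self.run_flag = False

    def apwait(self):
        self.dma_flag = False
        self.run_flag = False

    def aptime(self):
        """Treated as host execution: warn if either flag is still set."""
        if self.dma_flag or self.run_flag:
            self.warnings += 1

# Replay: APPUT, APPUT, APWD, VADD, APWR, APGET, APTIME
checker = SyncChecker()
checker.start_dma()
checker.start_dma()
checker.apwd()
checker.start_run()
checker.apwr()
checker.start_dma()
checker.aptime()
```

In this replay the only warning comes from calling APTIME while the final APGET is still flagged, which is the situation a CALL APWD after the APGET would clear.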


5.2.4 TIMING ESTIMATES 


The timing accumulators are communicated to the various MATHSIM 
routines via the following: 


      COMMON /TIMING/ MTYPE,CONAPX,CONDMA,TIMRUN,TIMAPX,TIMDMA


This statement is in all of the algorithm-simulating routines and in
four APEX subroutines: APEX, APPUT, APGET, and APCLR. The
accumulators are initialized in APCLR. The following sections define
and describe the accumulators.


5.2.4.1 TIMRUN


TIMRUN accumulates the AP program execution time estimates. MATHSIM 
estimates loop times for routines represented by a single loop by using 
the TYPICAL loop times given in Appendix D, plus the SETUP times. The 
SETUP times depend upon memory type which is designated in MATHSIM by 
the variable MTYPE in the preceding COMMON statement. These particular 
programs contain the following statement: 


      DATA SETUP(1), ...


Other routines with more complex algorithms contain timing formulas of 
an empirical nature derived from the timing tables in Appendix E. The 
formulas are established so that extrapolation out of the range of the 
tables gives reasonably accurate timing estimates. Interpolations 
within the range of the tables give an accuracy well within five 
percent. 


5.2.4.2 TIMAPX


TIMAPX accumulates the estimates for the AP executive (host overhead)
time. All algorithm-simulating routines contain an integer variable
SIZE, usually dependent upon MTYPE, which contains the AP program
source word length. Each time a particular program is called, it
executes the following statement:


      CALL APEX(SIZE)   or   CALL APEX(SIZE(MTYPE))


The subroutine APEX checks to see whether or not the subroutine is
loaded. If the program is designated as already AP-resident, MATHSIM
adds an estimate of the system overhead time onto TIMAPX. The value of
this estimate is a constant assigned to CONAPX in MATHSIM. When the
program is not resident in the AP, simulation is effected by estimating
the time needed to load a program of the specified SIZE. This estimate
is based on the constant CONDMA.


Because it is difficult to arrive at a precise value for CONAPX, TIMAPX
can be only an order of magnitude estimate when CONAPX is involved in
its calculation. However, program loading times are more accurate
because CONDMA can be established more precisely.


Note that the times for some applications are dominated by TIMAPX.
This is true for many calls to simple vector manipulations. Using the
vector function chainer reduces multiple calls to Math Library routines
to a single call, and thus reduces system overhead. MATHSIM, however,
does not simulate the vector function chainer software. Therefore, the
user must subtract the host overhead estimate (CONAPX) an appropriate
number of times from the total timing estimate to account for use of
vector function chaining.


5.3 SYSTEM DEPENDENCY


MATHSIM contains several installation-dependent features, as listed 
below. 


•  The subroutine APEX calls the FORTRAN function
   ILOC(X), which returns the location of the variable
   X. This function is FPS-supplied.


•  The subroutine APCLR assigns the following timing
   parameters:


        MTYPE  = n
        CONDMA = ss
        CONAPX = tt


   where:


        n    is the memory type;
             1 = fast, 2 = standard
             (default = 2)


        ss   is the time per 16-bit word
             transfer (µsec)
             (default = 2 µsec)


        tt   is the time per s-pad load
             and go (µsec)
             (default = 1000 µsec)


•  The subroutine APXSET assigns the program source
   size with the statement:


        PSSIZ = n

   where:


        n    is the program source size
             (default = 1024)


•  The main data size is indicated in the following
   statement:


        COMMON /COMMD/ APMD(1024)

   It can be changed by changing the entry in every
   occurrence of this statement; the statement appears
   in all Math Library routines and in the two APEX
   routines APPUT and APGET.


•  The MATHSIM routines APSTOP and APEX write messages
   to the unit specified in the following statement:


        IOUNIT = n

   where:


        n    is the logical unit number
             (default = 1)


5.4 EXAMPLES


This section contains a sample MATHSIM routine and two programming 
examples applicable to MATHSIM. 


5.4.1 EXAMPLE 1: SAMPLE ROUTINE 


The following VADD routine is typical of the routines provided in
MATHSIM:


C****** VADD = VECTOR ADD = REL 1.0, MAY 78 ******
      SUBROUTINE VADD(A,I,B,J,C,K,N)
      INTEGER*2 A,I,B,J,C,K,N
      INTEGER*2 M,IA,IB,IC
      REAL SETUP(2), LOOP(2)
      INTEGER SIZE(2)
      COMMON /TIMING/ MTYPE,CONAPX,CONDMA,TIMRUN,TIMAPX,
     X                TIMDMA
      COMMON /COMMD/ APMD(1024)
      DATA SETUP(1), SETUP(2), LOOP(1), LOOP(2)
     X     / 0.667, 1.17, 0.84, 1.33 /
      DATA SIZE(1), SIZE(2) / 17, 8 /
      IA=A+1
      IB=B+1
      IC=C+1
      DO 100 M=1,N
      APMD(IC)=APMD(IB)+APMD(IA)
      IA=IA+I
      IB=IB+J
  100 IC=IC+K
C
C     /* TIMING. */
      CALL APEX(SIZE(MTYPE))
      TIMRUN = TIMRUN + SETUP(MTYPE) + FLOAT(N) * LOOP(MTYPE)
      RETURN
      END


Note that the user has access to the timing information by including 
the statement COMMON /TIMING/ ... in the calling program. Also, 
because MTYPE is assigned in the call to APCLR, the user may change the 
memory type after that call. This allows the user to easily obtain 
timing for either standard or fast memory.


5.4.2 EXAMPLE 2: PROGRAM FOR VECTOR ADDITION


The following is an example of a calling program to add two vectors; 
this version can be used with either the AP or MATHSIM. 


      DIMENSION A(100),B(100),C(100)
      .
      .
      .
C /* TIMING INFO WRITTEN ON LOGICAL UNIT 'IOUNIT'... */
      IOUNIT = 1
C /* APCLR INITIALIZES TIMING (AMONG OTHER THINGS). */
      CALL APCLR
      CALL APPUT(A,0,100,2)
      CALL APPUT(B,100,100,2)
      CALL APWD
      CALL VADD(0,1,100,1,200,1,100)
      CALL APWR
      CALL APGET(C,200,100,2)
C /* WRITE ACCUMULATED TIMES... */
      CALL APTIME(IOUNIT)
      .
      .
      .
      END
5.4.3 EXAMPLE 3: PROGRAM FOR VECTOR ADDITION WITH TIMING ESTIMATES


The following is an example of a program to add two vectors and
determine timing estimates. This program is written for use with
MATHSIM only.


      DIMENSION A(100),B(100),C(100)
      .
      .
      .
C /* TIMING INFO WRITTEN ON LOGICAL UNIT 'IOUNIT'... */
      IOUNIT = 1
C /* APCLR INITIALIZES TIMING (AMONG OTHER THINGS). */
      CALL APCLR
      CALL APPUT(A,0,100,2)
      CALL APPUT(B,100,100,2)
      CALL APWD
      CALL VADD(0,1,100,1,200,1,100)
      CALL APWR
      CALL APGET(C,200,100,2)
C /* WRITE ACCUMULATED TIMES... */
      CALL APTIME(IOUNIT)
      .
      .
      .
      END


When this program is run with the parameters set to the default values, 
the subroutine APTIME displays the following message: 


AP-120B TIMING ESTIMATES (ACCUM). 


RUN = 0.13417 (MSEC) 
APEX = 1.06400 
DMA = 1.20000 


SYNCHRONIZATION WARNINGS = 1 


To eliminate the synchronization warning, the user should insert a CALL 
APWD after the CALL APGET statement. 
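
The call sequence with the extra wait can be sketched as a
self-contained program. This is an illustration only: the routines
defined after the main program are hypothetical stand-ins written for
this sketch (simple copies into and out of an APMD array, with the
wait routines doing nothing), not the MATHSIM or APEX sources, and the
timing bookkeeping is left out.

```fortran
      PROGRAM SYNCEX
C     Sketch only. The AP* and VADD routines below are hypothetical
C     stand-ins, not the MATHSIM sources.
      REAL A(100), B(100), C(100)
      INTEGER I
      DO 10 I = 1, 100
         A(I) = FLOAT(I)
         B(I) = 2.0 * FLOAT(I)
   10 CONTINUE
      CALL APCLR
      CALL APPUT(A, 0, 100, 2)
      CALL APPUT(B, 100, 100, 2)
      CALL APWD
      CALL VADD(0, 1, 100, 1, 200, 1, 100)
      CALL APWR
      CALL APGET(C, 200, 100, 2)
C     The added wait: do not use C until the transfer is complete.
      CALL APWD
      WRITE (*,*) C(1), C(100)
      END
C
C     ---- Hypothetical stand-ins below this line ----
      SUBROUTINE APCLR
      COMMON /COMMD/ APMD(1024)
      INTEGER I
      DO 10 I = 1, 1024
         APMD(I) = 0.0
   10 CONTINUE
      RETURN
      END
C
      SUBROUTINE APPUT(HOST, IAP, N, ITYPE)
      INTEGER IAP, N, ITYPE, I
      REAL HOST(N)
      COMMON /COMMD/ APMD(1024)
C     ITYPE is ignored here; values are copied as host reals.
      DO 10 I = 1, N
         APMD(IAP + I) = HOST(I)
   10 CONTINUE
      RETURN
      END
C
      SUBROUTINE APGET(HOST, IAP, N, ITYPE)
      INTEGER IAP, N, ITYPE, I
      REAL HOST(N)
      COMMON /COMMD/ APMD(1024)
      DO 10 I = 1, N
         HOST(I) = APMD(IAP + I)
   10 CONTINUE
      RETURN
      END
C
      SUBROUTINE VADD(A, I, B, J, C, K, N)
      INTEGER A, I, B, J, C, K, N, M, IA, IB, IC
      COMMON /COMMD/ APMD(1024)
      IA = A + 1
      IB = B + 1
      IC = C + 1
      DO 100 M = 1, N
         APMD(IC) = APMD(IB) + APMD(IA)
         IA = IA + I
         IB = IB + J
         IC = IC + K
  100 CONTINUE
      RETURN
      END
C
      SUBROUTINE APWD
      RETURN
      END
C
      SUBROUTINE APWR
      RETURN
      END
```

With A(I) = I and B(I) = 2*I, the program writes C(1) = 3.0 and
C(100) = 300.0. On a real AP the final CALL APWD is what guarantees
that C holds the transferred results before the host reads it.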
APPENDIX A ALPHABETICAL INDEX OF AP MATH LIBRARY ROUTINES


                                   Typical       Program
                                   Execution     Size
Page Name Operation                Time/Loop     (AP
                                   (us)          PS words)

                                   167 | 333     167 | 333



E-195 ACORF AUTO-CORRELATION (FREQUENCY-DOMAIN) 1.80* 2.70 501 489
E-193 ACORT AUTO-CORRELATION (TIME-DOMAIN) 0.29* 0.29 i220: AZ)
E-270 ADV2 ADVANCE POINTERS AFTER RADIX 2 FFT 0.7 @ 0.7 7 7
E-271 ADV4 ADVANCE POINTERS AFTER RADIX 4 FFT 0.7 @ 0.7 7 7
E-13 APCHK CHECK AP PROGRAM ERROR CONDITION #.# #.# 0 0
E-8 APCLR INITIALIZE THE AP #.# #.# 0 0
E-6 APGET GET DATA FROM THE AP #.# #.# 0 0
E-12 APGSP READ AN AP S-PAD REGISTER #.# #.# 0 0
E-4 APPUT PUT DATA INTO THE AP #.# #.# 0 0
E-14 APSTAT GET AP HARDWARE STATUS #.# #.# 0 0
E-11 APWAIT WAIT FOR AP #.# #.# 0 0
E-9 APWD WAIT FOR AP DATA TRANSFER #.# #.# 0 0
E-10 APWR WAIT FOR AP PROGRAM EXECUTION #.# #.# 0 0
E-186 ASPEC ACCUMULATING AUTO-SPECTRUM 0.8 1.5 21 22 
E-234 ATAN SCALAR ARCTANGENT 8.7 @ 8.7 74 74 
E-235 ATN2 SCALAR ARCTANGENT OF Y/X 13.8 @13.8 74 74 
E-261 BITREV COMPLEX VECTOR BIT REVERSE ORDERING 0.9 1.4 45 43 
E-199 CCORF CROSS-CORRELATION (FREQUENCY-DOMAIN) 2.58* 3.93 526 510
E-197 CCORT CROSS-CORRELATION (TIME-DOMAIN) 0.29* 0.29 121 121
E-115 CDOTPR COMPLEX DOT PRODUCT 0.7 1.3 15 16
E-156 CFFT COMPLEX TO COMPLEX FFT (IN PLACE) 0.28* 0.40 186 184
E-167 CFFT2D COMPLEX TO COMPLEX 2-DIMENSIONAL FFT 0.5 * 0.5 274 274 
E-158 CFFTB COMPLEX TO COMPLEX FFT (NOT IN PLACE) 0.20* 0.28 189 189 
E-164 CFFTSC COMPLEX FFT SCALE 0.8 1.3 42 42 
E-268 CLSTAT CLEAR FFT MODE STATUS BITS 0.5 @ 0.5 19 19
E-192 COHER COHERENCE FUNCTION 4.0 4.5 109 114
E-172 CONV CONVOLUTION (CORRELATION) 0.28* 0.28 106 106
E-233 COS SCALAR COSINE 5.4 @ 5.4 35 35
E-103 CRVADD COMPLEX AND REAL VECTOR ADD 1.3 1.8 14 14 
E-106 CRVDIV COMPLEX AND REAL VECTOR DIVIDE 3.3 3.3 92 92
E-105 CRVMUL COMPLEX AND REAL VECTOR MULTIPLY 1.3 1.8 14 14
E-104 CRVSUB COMPLEX AND REAL VECTOR SUBTRACT 1.3 1.8 14 14 
E-187 CSPEC ACCUMULATING CROSS-SPECTRUM 1.3 2.7 39 40 
E-281 CTOR COMPLEX TO REAL FFT UNSCRAMBLE 0.13* 0.13 80 80
E-149 CTRN3 3-DIMENSION COORDINATE TRANSFORMATION 2.3 * 2.5 a7 cy;
E-98 CVADD COMPLEX VECTOR ADD 1.0 2.0 13 12
E-92 CVCOMB COMPLEX VECTOR COMBINE 1.1 1.7 10 10
E-97 CVCONJ COMPLEX VECTOR CONJUGATE 0.7 1.3 10 12
E-113 CVEXP COMPLEX VECTOR EXPONENTIAL 2.0 2.0 43 43 
E-91 CVFILL COMPLEX VECTOR FILL 0.5 0.7 8 8 
E-107 CVMA COMPLEX VECTOR MULTIPLY AND ADD 1.3 2.7 29 30 
E-109 CVMAGS COMPLEX VECTOR MAGNITUDE SQUARED 0.7 1.2 13 18 
E-114 CVMEXP VECTOR MULTIPLY COMPLEX EXPONENTIAL 2.3 2.3 48 48 
E-90 CVMOV COMPLEX VECTOR MOVE 0.8 1.3 9 9
E-100 CVMUL COMPLEX VECTOR MULTIPLY 1.0 2.0 23 26 
E-96 CVNEG COMPLEX VECTOR NEGATE 0.8 1.3 11 11
E-102 CVRCIP COMPLEX VECTOR RECIPROCAL 5.2 5.2 50 50
E-93 CVREAL FORM COMPLEX VECTOR OF REALS 0.8 1.3 9 9
E-101 CVSMUL COMPLEX VECTOR SCALAR MULTIPLY 0.8 1.3 12 12 
E-99 CVSUB COMPLEX VECTOR SUBTRACT 1.0 2.0 13 12
E-257 DAREAD READ DEVICE ADDRESS REGISTER 0.3 @ 0.3 2 2 
E-258 DAWRIT WRITE DEVICE ADDRESS REGISTER 0.3 @ 0.3 2 2 
E-287 DDDA DOUBLE + DOUBLE TO DOUBLE ADD 7.5 @ 7.5 48 48
E-288 DDDM DOUBLE * DOUBLE TO DOUBLE MULTIPLY 18.5 @18.5 117 117
E-174 DEQ22 DIFFERENCE EQUATION, 2 POLES, 2 ZEROS 0.8 0.8 29 25 
E-227 DIV SCALAR DIVIDE 3.8 @ 3.8 28 28 
E-66 DOTPR DOT PRODUCT 0.5 0.8 21 9
E-231 EXP SCALAR EXPONENTIAL 4.2 @ 4.2 28 28
E-263 FFT2 RADIX 2 FFT FIRST PASS 1.3 2.7 16 16
E-265 FFT2B RADIX 2 FFT FIRST PASS + BIT REVERSE 1.3 2.7 25 25
E-264 FFT4 RADIX 4 FFT PASS 3.7 5.3 79 79
E-266 FFT4B RADIX 4 FFT FIRST PASS + BIT REVERSE 2.7 5.3 43 43
E-151 FMMM FAST MEMORY MATRIX MULTIPLY 0.43* 61
E-153 FMMM32 FAST MEMORY MATRIX MULTIPLY (<=32) 0.41* 33
E-184 HANN HANNING WINDOW MULTIPLY 0.7 0.8 41 41
E-183 HIST HISTOGRAM 1.3 1.4 71 71
E-269 ILOG2 LOGARITHM (BASE 2) 4.0 @ 4.0 19 19
E-230 LN SCALAR NATURAL LOGARITHM 4.0 @ 4.0 37 37
E-229 LOG SCALAR LOGARITHM (BASE 10) 4.7 @ 4.7 37 37 
E-85 LVEQ LOGICAL VECTOR EQUAL 0.8 1.3 23 13 
E-84 LVGE LOGICAL VECTOR GREATER THAN OR EQUAL 0.8 1.3 23 13
E-83 LVGT LOGICAL VECTOR GREATER THAN 0.8 1.3 23 13
E-86 LVNE LOGICAL VECTOR NOT EQUAL 0.8 1.3 23 13
E-87 LVNOT LOGICAL VECTOR NOT 0.5 0.8 21 12
E-141 MATINV MATRIX INVERSE 1.6 * 2.1 160 160
E-69 MAXMGV MAXIMUM MAGNITUDE ELEMENT IN VECTOR 0.3 0.3 19 19
E-67 MAXV MAXIMUM ELEMENT IN VECTOR 0.3 0.3 19 19
E-253 MDCOM MAIN DATA COMPARE AND SET S-PAD 1.8 @ 2.0 11 11
E-72 MEAMGV MEAN OF VECTOR ELEMENT MAGNITUDES 0.3 0.3 52 52
E-71 MEANV MEAN VALUE OF VECTOR ELEMENTS 0.3 0.3 49 49 
E-73 MEASQV MEAN OF VECTOR ELEMENT SQUARES 0.3 0.3 52 52 
E-70 MINMGV MINIMUM MAGNITUDE ELEMENT IN VECTOR 0.3 0.3 19 19 
E-68 MINV MINIMUM ELEMENT IN VECTOR 0.3 0.3 19 19 
E-209 MMTADD VECTOR ADD (MD+MD TO TM) 0.7 0.8 20 13
E-211 MMTMUL VECTOR MULTIPLY (MD*MD TO TM) 0.7 0.8 20 13
E-210 MMTSUB VECTOR SUBTRACT (MD-MD TO TM) 0.7 0.8 20 13
E-137 MMUL MATRIX MULTIPLY 0.62* 0.83 59 59
E-139 MMUL32 MATRIX MULTIPLY (DIMENSION <=32) 0.50* 0.73 27 27
E-206 MTIMOV VECTOR MOVE WITH INCREMENT (MD TO TM) 0.5 0.5 7 7
E-212 MTMADD VECTOR ADD (MD+TM TO MD) 0.5 0.8 20 9 
E-215 MTMMUL VECTOR MULTIPLY (MD*TM TO MD) 0.5 0.8 20 9 
E-204 MTMOV VECTOR MOVE (MD TO TM) 0.2 0.3 6 7 
E-213  MTMSUB VECTOR SUBTRACT (MD-TM TO MD) 0.5 0.8 20 9 
E-136 MTRANS MATRIX TRANSPOSE 0.5 0.9 18 22 
E-216 MTTADD VECTOR ADD (MD+TM TO TM) 0.5 0.5 20 20
E-219 MTTMUL VECTOR MULTIPLY (MD*TM TO TM) 0.5 0.5 20 20
E-217 MTTSUB VECTOR SUBTRACT (MD-TM TO TM) 0.5 0.5 20 20
E-145 MVML3 MATRIX VECTOR MULTIPLY (3X3) 2.0 * 2.2 30 30
E-147 MVML4 MATRIX VECTOR MULTIPLY (4X4) 3.3 * 3.8 39 39
E-279 PCFFT PARTIAL COMPLEX FFT 1.05* 1.50 117 117
E-111 POLAR RECTANGULAR TO POLAR CONVERSION 19.5 19.5 120 120
E-255 RDC5 READ CONTROL BIT 5 INTERRUPT 1.5 @ 1.5 9 9
E-262 REALTR REAL FFT UNRAVEL AND FINAL PASS 0.4 0.7 68 68
E-112 RECT POLAR TO RECTANGULAR CONVERSION 2.3 2.3 49 49
E-160 RFFT REAL TO COMPLEX FFT (IN PLACE) 0.18* 0.27 253 251
E-169 RFFT2D REAL TO COMPLEX 2-DIMENSIONAL FFT 0.4 * 0.4 585 585
E-162 RFFTB REAL TO COMPLEX FFT (NOT IN PLACE) 0.14* 0.20 252 252
E-165 RFFTSC REAL FFT SCALE AND FORMAT 0.7 0.8 59 59
E-74 RMSQV ROOT-MEAN-SQUARE OF VECTOR ELEMENTS 0.3 0.3 81 81
E-282 RTOC REAL TO COMPLEX FFT SCRAMBLE 0.09* 0.09 143 143 
E-248 SAVESP SAVE S-PAD INTO PROGRAM MEMORY 0.8 * 0.8 18 18
E-249 SAVSPO SAVE S-PAD 0 INTO PROGRAM MEMORY 2.0 * 2.0 11 11
E-110 SCJMA SELF-CONJUGATE MULTIPLY AND ADD 0.8 1.5 14 15
E-286 SDDA SINGLE + DOUBLE TO DOUBLE ADD 4.5 @ 4.5 28 28
E-272 SET24B SETUP FOR FFT2B AND FFT4B 1.2 @ 1.2 8 8
E-252 SET2SP LOAD 2 S-PADS FROM PROGRAM MEMORY 5.7 @ 5.7 33 33
E-256 SETC5 SET CONTROL BIT 5 INTERRUPT 0.2 @ 0.2 1 1
E-250 SETSP LOAD S-PADS FROM PROGRAM MEMORY 2.3 * 2.3 33 33
E-232 SIN SCALAR SINE 4.9 @ 4.9 35 35
E-143 SOLVEQ LINEAR EQUATION SOLVER 0.7 * 0.9 216 222
E-239 SPADD S-PAD ADD 0.2 @ 0.2 1 1
E-245 SPAND S-PAD AND 0.2 @ 0.2 1 1
E-242 SPDIV S-PAD DIVIDE 6.2 @ 6.2 43 43
E-236 SPFLT FLOAT S-PAD INTEGER 0.8 @ 0.8 5 5
E-244 SPLS S-PAD LEFT SHIFT 0.3 * 0.3 5 5
E-241 SPMUL S-PAD MULTIPLY 2.3 @ 2.3 14 14
E-238 SPNEG S-PAD NEGATE 0.3 @ 0.3 2 2
E-247 SPNOT S-PAD NOT 0.2 @ 0.2 1 1
E-246 SPOR S-PAD OR 0.2 @ 0.2 1 1
E-243 SPRS S-PAD RIGHT SHIFT 0.3 * 0.3 5 5
E-240 SPSUB S-PAD SUBTRACT 0.2 @ 0.2 1 1
E-237 SPUFLT S-PAD UNSIGNED FLOAT 0.8 @ 0.8 8 8
E-228 SQRT SCALAR SQUARE ROOT 3.8 @ 3.8 28 28
E-284 SSDA SINGLE + SINGLE TO DOUBLE ADD 1.5 @ 1.5 10 10
E-285 SSDM SINGLE * SINGLE TO DOUBLE MULTIPLY 11.5 @11.5 81 81
E-267 STSTAT SET FFT MODE STATUS BITS 5.0 @ 5.0 19 19
E-62 SVE SUM OF VECTOR ELEMENTS 0.3 0.3 7 7
E-63 SVEMG SUM OF VECTOR ELEMENT MAGNITUDES 0.3 0.3 10 10
E-64 SVESQ SUM OF VECTOR ELEMENT SQUARES 0.3 0.3 10 10
E-65 SVS SUM OF VECTOR SIGNED SQUARES 0.3 0.3 11 11
E-201 TCONV POSTTAPERED CONVOLUTION (CORRELATION ) 0.30* 0.30 Hi2> YZ 
E-207 TMIMOV VECTOR MOVE WITH INCREMENT (TM TO MD) 0.3 0.3 15 15 
E-205 TMMOV VECTOR MOVE (TM TO MD) 0.2 0.3 5 5
E-214 TMMSUB VECTOR SUBTRACT (TM-MD TO MD) 0.5 0.8 20 9
E-218 TMTSUB VECTOR SUBTRACT (TM-MD TO TM) 0.5 0.5 20 20
E-191 TRANS TRANSFER FUNCTION 3.3 3.3 100 100
E-208 TTIMOV VECTOR MOVE WITH INCREMENT (TM TO TM) 0.5 0.5 7 7
E-220 TTMADD VECTOR ADD (TM+TM TO MD) 0.5 0.5 20 20
E-222 TTMMUL VECTOR MULTIPLY (TM*TM TO MD) 0.5 0.5 20 20
E-221 TTMSUB VECTOR SUBTRACT (TM-TM TO MD) 0.5 0.5 20 20
E-223 TTTADD VECTOR ADD (TM+TM TO TM) 0.7 0.7 9 9
E-225 TTTMUL VECTOR MULTIPLY (TM*TM TO TM) 0.7 0.7 10 10
E-224 TTTSUB VECTOR SUBTRACT (TM-TM TO TM) 0.7 0.7 9 9
E-53 VAAM VECTOR ADD, ADD, AND MULTIPLY 1.5 2.3 13 20
E-32 VABS VECTOR ABSOLUTE VALUE 0.5 0.8 17 7
E-22 VADD VECTOR ADD 0.8 1.3 20 8
E-36 VALOG VECTOR ANTILOGARITHM (BASE 10) 2.3 2.3 58 58
E-48 VAM VECTOR ADD AND MULTIPLY 1.2 1.8 23 14
E-55 VAND VECTOR LOGICAL AND 0.8 1.3 20 8
E-40 VATAN VECTOR ARCTANGENT 9.7 9.8 87 87
E-41 VATN2 VECTOR ARCTANGENT OF Y/X 14.2 14.2 88 88
E-189 VAVEXP VECTOR EXPONENTIAL AVERAGING 0.8 1.3 55 46
E-188 VAVLIN VECTOR LINEAR AVERAGING 0.8 1.3 54 46
E-80 VCLIP VECTOR CLIP 0.5 0.8 16 16
E-16 VCLR VECTOR CLEAR 0.2 0.3 16 4
E-39 VCOS VECTOR COSINE 1.3 1.3 34 34
E-190 VDBPWR VECTOR CONVERSION TO DB (POWER) 1.2 1.3 75 75
E-25 VDIV VECTOR DIVIDE 1.7 1.7 75 75
E-56 VEQV VECTOR LOGICAL EQUIVALENCE 0.8 1.3 20 8
E-37 VEXP VECTOR EXPONENTIAL 2.3 2.3 55 55
E-259 VFCL1 VECTOR FUNCTION CALLER (1 ARGUMENT) 0.8 1.0 10 10
E-260 VFCL2 VECTOR FUNCTION CALLER (2 ARGUMENT) 1.0 1.0 11 11
E-19 VFILL VECTOR FILL 0.3 0.3 5 5
E-118 VFIX VECTOR INTEGER FIX 0.7 0.8 18 7
E-133 VFIX32 VECTOR 32-BIT INTEGER FIX 1.2 1.2 33 33 
E-117 VFLT VECTOR INTEGER FLOAT 0.5 0.8 13 11
E-132 VFLT32 VECTOR 32-BIT INTEGER FLOAT 1.7 1.7 65 65 
E-58 VFRAC VECTOR TRUNCATE TO FRACTION 0.7 0.8 13 13 
E-81 VICLIP VECTOR INVERTED CLIP 0.7 0.8 19 19
E-95 VIMAG EXTRACT IMAGINARIES OF COMPLEX VECTOR 0.5 0.8 18 8
E-60 VINDEX VECTOR INDEX 0.8 1.3 28 26
E-59 VINT VECTOR TRUNCATE TO INTEGER 0.5 0.8 9 9
E-82 VLIM VECTOR LIMIT 0.5 0.8 14 14
E-88 VLMERG VECTOR LOGICAL MERGE 0.8 1.5 23 16
E-35 VLN VECTOR NATURAL LOGARITHM 2.7 2.7 42 42
E-34 VLOG VECTOR LOGARITHM (BASE 10) 2.7 2.7 54 58
E-46 VMA VECTOR MULTIPLY AND ADD 1.2 1.8 23 15
E-76 VMAX VECTOR MAXIMUM 0.8 1.3 22 13
E-78 VMAXMG VECTOR MAXIMUM MAGNITUDE 0.8 1.3 14 14
E-77 VMIN VECTOR MINIMUM 0.8 1.3 22 13
E-79 VMINMG VECTOR MINIMUM MAGNITUDE 0.8 1.3 14 14
E-51 VMMA VECTOR MULTIPLY, MULTIPLY, AND ADD 1.5 2.3 27 19
E-52 VMMSB VECTOR MULTIPLY MULTIPLY AND SUBTRACT 1.5 2.3 27 19
E-17 VMOV VECTOR MOVE 0.5 0.8 16 6 
E-43 VMSA VECTOR MULTIPLY AND SCALAR ADD 0.8 1.3 23 14 
E-47 VMSB VECTOR MULTIPLY AND SUBTRACT 1.2 1.8 23 15
E-24 VMUL VECTOR MULTIPLY 0.8 1.3 20 11
E-21 VNEG VECTOR NEGATE 0.5 0.8 18 7
E-57 VOR VECTOR LOGICAL OR 0.8 1.3 20 8
E-131 VPK16 VECTOR 16-BIT BYTE PACK 0.8 0.8 46 46
E-128 VPK8 VECTOR 8-BIT BYTE PACK 0.9 0.9 65 65
E-175 VPOLY VECTOR POLYNOMIAL EVALUATION 1.0 * 1.2 41 41
E-20 VRAMP VECTOR RAMP 0.3 0.3 12 12
E-42 VRAND VECTOR RANDOM NUMBERS 1.2 1.2 16 16
E-94 VREAL EXTRACT REALS OF COMPLEX VECTOR 0.5 0.8 17 7
E-26 VSADD VECTOR SCALAR ADD 0.5 0.8 19 8
E-49 VSBM VECTOR SUBTRACT AND MULTIPLY 1.2 1.8 23 14
E-54 VSBSBM VECTOR SUBTRACT SUBTRACT AND MULTIPLY 1.5 2.3 13 20
E-121 VSCALE VECTOR SCALE (POWER 2) AND FIX 0.7 0.8 12 12 
E-123 VSCSCL VECTOR SCAN, SCALE (POWER 2) AND FIX 1.5 1.7 19 19 
E-134 VSEFLT VECTOR SIGN EXTEND AND FLOAT 0.8 0.8 15 15 
E-125 VSHFX VECTOR SHIFT AND FIX 0.7 0.8 9 9
E-179 VSIMPS VECTOR SIMPSONS 1/3 RULE INTEGRATION 0.7 0.8 25 25
E-38 VSIN VECTOR SINE 1.3 1.3 34 34
E-44 VSMA VECTOR SCALAR MULTIPLY AND ADD 0.8 1.3 21 14
E-120 VSMAFX VECTOR SCALAR MULTIPLY, ADD, AND FIX 0.7 0.8 14 13
E-50 VSMSA VECTOR SCALAR MULTIPLY AND SCALAR ADD 0.5 0.8 23 15
E-45 VSMSB VECTOR SCALAR MULTIPLY AND SUBTRACT 0.8 1.3 21 14
E-27 VSMUL VECTOR SCALAR MULTIPLY 0.5 0.8 20 9
E-30 VSQ VECTOR SQUARE 0.5 0.8 9 9
E-33 VSQRT VECTOR SQUARE ROOT 1.8 1.8 79 79
E-31 VSSQ VECTOR SIGNED SQUARE 0.5 0.8 21 9
E-23 VSUB VECTOR SUBTRACT 0.8 1.3 20 8
E-177 VSUM VECTOR SUM OF ELEMENTS INTEGRATION 0.7 0.8 13 te
E-18 VSWAP VECTOR SWAP 1.2 1.5 21 12
E-178 VTRAPZ VECTOR TRAPEZOIDAL RULE INTEGRATION 0.7 0.8 16 16
E-28 VTSADD VECTOR TABLE SCALAR ADD 0.5 0.8 8 8
E-29 VTSMUL VECTOR TABLE SCALAR MULTIPLY 0.5 0.8 8 8
E-129 VUP16 VECTOR 16-BIT BYTE UNPACK 0.8 0.8 61 61 
E-126 VUP8 VECTOR 8-BIT BYTE UNPACK 0.5 0.5 71 71 
E-130 VUPS16 VECTOR 16-BIT SIGNED BYTE UNPACK 1.3 1.3 58 58 
E-127 VUPS8 VECTOR 8-BIT SIGNED BYTE UNPACK 0.9 0.9 107 107
E-180 WIENER WIENER LEVINSON ALGORITHM 0.50* 0.65 100 100
E-277 XBITRE EXPANDED BIT REVERSE 3.7 3.7 44 44
E-273 XCFFT EXPANDED COMPLEX FFT 0.32* 0.42 187 187
E-280 XFFT4 EXPANDED RADIX 4 FFT PASS 3.7 5.3 79 79
E-278 XREALT EXPANDED REAL FFT FINAL PASS 0.4 0.7 71 71 
E-275 XRFFT EXPANDED REAL FFT 0.19* 0.28 256 256 
E-254 ZMD CLEAR ALL PAGES OF MAIN DATA MEMORY 0.2 0.3 29 29 


Notes: #.# Timing host system dependent 
* Refer to description of routine for explanation of timing 
@ Total execution time 
DATA TRANSFER AND CONTROL OPERATIONS (APEX)


E-4 APPUT PUT DATA INTO THE AP #.# #.# 0 0
E-6 APGET GET DATA FROM THE AP #.# #.# 0 0
E-8 APCLR INITIALIZE THE AP #.# #.# 0 0
E-9 APWD WAIT FOR AP DATA TRANSFER #.# #.# 0 0
E-10 APWR WAIT FOR AP PROGRAM EXECUTION #.# #.# 0 0
E-11 APWAIT WAIT FOR AP #.# #.# 0 0
E-12 APGSP READ AN AP S-PAD REGISTER #.# #.# 0 0
E-13 APCHK CHECK AP PROGRAM ERROR CONDITION #.# #.# 0 0
E-14 APSTAT GET AP HARDWARE STATUS #.# #.# 0 0


E-16 VCLR VECTOR CLEAR 0.2 0.3 16 4
E-17 VMOV VECTOR MOVE 0.5 0.8 16 6
E-18 VSWAP VECTOR SWAP 1.2 1.5 21 12
E-19 VFILL VECTOR FILL 0.3 0.3 5 5
E-20 VRAMP VECTOR RAMP 0.3 0.3 12 12
E-21 VNEG VECTOR NEGATE 0.5 0.8 18 7
E-22 VADD VECTOR ADD 0.8 1.3 20 8
E-23 VSUB VECTOR SUBTRACT 0.8 1.3 20 8
E-24 VMUL VECTOR MULTIPLY 0.8 1.3 20 11
E-25 VDIV VECTOR DIVIDE 1.7 1.7 75 75
E-26 VSADD VECTOR SCALAR ADD 0.5 0.8 19 8
E-27 VSMUL VECTOR SCALAR MULTIPLY 0.5 0.8 20 9
E-28 VTSADD VECTOR TABLE SCALAR ADD 0.5 0.8 8 8
E-29 VTSMUL VECTOR TABLE SCALAR MULTIPLY 0.5 0.8 8 8
E-30 VSQ VECTOR SQUARE 0.5 0.8 9 9
E-31 VSSQ VECTOR SIGNED SQUARE 0.5 0.8 21 9
E-32 VABS VECTOR ABSOLUTE VALUE 0.5 0.8 17 7
E-33 VSQRT VECTOR SQUARE ROOT 1.8 1.8 79 79
E-34 VLOG VECTOR LOGARITHM (BASE 10) 2.7 2.7 54 58
E-35 VLN VECTOR NATURAL LOGARITHM 2.7 2.7 42 42
E-36 VALOG VECTOR ANTILOGARITHM (BASE 10) 2.3 2.3 58 58
E-37 VEXP VECTOR EXPONENTIAL 2.3 2.3 55 55
E-38 VSIN VECTOR SINE 1.3 1.3 34 34
E-39 VCOS VECTOR COSINE 1.3 1.3 34 34
E-40 VATAN VECTOR ARCTANGENT 9.7 9.8 87 87
E-41 VATN2 VECTOR ARCTANGENT OF Y/X 14.2 14.2 88 88
E-42 VRAND VECTOR RANDOM NUMBERS 1.2 1.2 16 16
E-43 VMSA VECTOR MULTIPLY AND SCALAR ADD 0.8 1.3 23 14
E-44 VSMA VECTOR SCALAR MULTIPLY AND ADD 0.8 1.3 21 14
E-45 VSMSB VECTOR SCALAR MULTIPLY AND SUBTRACT 0.8 1.3 21 14
E-46 VMA VECTOR MULTIPLY AND ADD 1.2 1.8 23 15
E-47 VMSB VECTOR MULTIPLY AND SUBTRACT 1.2 1.8 23 15
E-48 VAM VECTOR ADD AND MULTIPLY 1.2 1.8 23 14
E-49 VSBM VECTOR SUBTRACT AND MULTIPLY 1.2 1.8 23 14
E-50 VSMSA VECTOR SCALAR MULTIPLY AND SCALAR ADD 0.5 0.8 23 15
E-51 VMMA VECTOR MULTIPLY, MULTIPLY, AND ADD 1.5 2.3 27 19
E-52 VMMSB VECTOR MULTIPLY MULTIPLY AND SUBTRACT 1.5 2.3 27 19
E-53 VAAM VECTOR ADD, ADD, AND MULTIPLY 1.5 2.3 13 20
E-54 VSBSBM VECTOR SUBTRACT SUBTRACT AND MULTIPLY 1.5 2.3 13 20
E-55 VAND VECTOR LOGICAL AND 0.8 1.3 20 8
E-56 VEQV VECTOR LOGICAL EQUIVALENCE 0.8 1.3 20 8
E-57 VOR VECTOR LOGICAL OR 0.8 1.3 20 8
E-58 VFRAC VECTOR TRUNCATE TO FRACTION 0.7 0.8 13 13
E-59 VINT VECTOR TRUNCATE TO INTEGER 0.5 0.8 9 9
E-60 VINDEX VECTOR INDEX 0.8 1.3 28 26
VECTOR-TO-SCALAR OPERATIONS
E-62 SVE SUM OF VECTOR ELEMENTS 0.3 0.3 7 7
E-63 SVEMG SUM OF VECTOR ELEMENT MAGNITUDES 0.3 0.3 10 10
E-64 SVESQ SUM OF VECTOR ELEMENT SQUARES 0.3 0.3 10 10
E-65 SVS SUM OF VECTOR SIGNED SQUARES 0.3 0.3 11 11
E-66 DOTPR DOT PRODUCT 0.5 0.8 21 9
E-67 MAXV MAXIMUM ELEMENT IN VECTOR 0.3 0.3 19 19
E-68 MINV MINIMUM ELEMENT IN VECTOR 0.3 0.3 19 19
E-69 MAXMGV MAXIMUM MAGNITUDE ELEMENT IN VECTOR 0.3 0.3 19 19
E-70 MINMGV MINIMUM MAGNITUDE ELEMENT IN VECTOR 0.3 0.3 19 19
E-71 MEANV MEAN VALUE OF VECTOR ELEMENTS 0.3 0.3 49 49
E-72 MEAMGV MEAN OF VECTOR ELEMENT MAGNITUDES 0.3 0.3 52 52
E-73 MEASQV MEAN OF VECTOR ELEMENT SQUARES 0.3 0.3 52 52
E-74 RMSQV ROOT-MEAN-SQUARE OF VECTOR ELEMENTS 0.3 0.3 81 81
VECTOR COMPARISON OPERATIONS 

E-76 VMAX VECTOR MAXIMUM 0.8 1.3 22 13
E-77 VMIN VECTOR MINIMUM 0.8 1.3 22 13
E-78 VMAXMG VECTOR MAXIMUM MAGNITUDE 0.8 1.3 14 14
E-79 VMINMG VECTOR MINIMUM MAGNITUDE 0.8 1.3 14 14
E-80 VCLIP VECTOR CLIP 0.5 0.8 16 16
E-81 VICLIP VECTOR INVERTED CLIP 0.7 0.8 19 19
E-82 VLIM VECTOR LIMIT 0.5 0.8 14 14
E-83 LVGT LOGICAL VECTOR GREATER THAN 0.8 1.3 23 13
E-84 LVGE LOGICAL VECTOR GREATER THAN OR EQUAL 0.8 1.3 23 13
E-85 LVEQ LOGICAL VECTOR EQUAL 0.8 1.3 23 13
E-86 LVNE LOGICAL VECTOR NOT EQUAL 0.8 1.3 23 13
E-87 LVNOT LOGICAL VECTOR NOT 0.5 0.8 21 12
E-88 VLMERG VECTOR LOGICAL MERGE 0.8 1.5 23 16
COMPLEX VECTOR ARITHMETIC 
E-90 CVMOV COMPLEX VECTOR MOVE 0.8 1.3 9 9
E-91 CVFILL COMPLEX VECTOR FILL 0.5 0.7 8 8
E-92 CVCOMB COMPLEX VECTOR COMBINE 1.1 1.7 10 10
E-93 CVREAL FORM COMPLEX VECTOR OF REALS 0.8 1.3 9 9
E-94 VREAL EXTRACT REALS OF COMPLEX VECTOR 0.5 0.8 17 7
E-95 VIMAG EXTRACT IMAGINARIES OF COMPLEX VECTOR 0.5 0.8 18 8
E-96 CVNEG COMPLEX VECTOR NEGATE 0.8 1.3 11 11
E-97 CVCONJ COMPLEX VECTOR CONJUGATE 0.7 1.3 10 12
E-98 CVADD COMPLEX VECTOR ADD 1.0 2.0 13 12
E-99 CVSUB COMPLEX VECTOR SUBTRACT 1.0 2.0 13 12
E-100 CVMUL COMPLEX VECTOR MULTIPLY 1.0 2.0 25 26
E-101 CVSMUL COMPLEX VECTOR SCALAR MULTIPLY 0.8 1.3 12 12
E-102 CVRCIP COMPLEX VECTOR RECIPROCAL 5.2 5.2 50 50
E-103 CRVADD COMPLEX AND REAL VECTOR ADD 1.3 1.8 14 14
E-104 CRVSUB COMPLEX AND REAL VECTOR SUBTRACT 1.3 1.8 14 14
E-105 CRVMUL COMPLEX AND REAL VECTOR MULTIPLY 1.3 1.8 14 14
E-106 CRVDIV COMPLEX AND REAL VECTOR DIVIDE 3.3 3.3 92 92
E-107 CVMA COMPLEX VECTOR MULTIPLY AND ADD 1.3 2.7 29 30
E-109 CVMAGS COMPLEX VECTOR MAGNITUDE SQUARED 0.7 1.2 13 18
E-110 SCJMA SELF-CONJUGATE MULTIPLY AND ADD 0.8 1.5 14 15
E-111 POLAR RECTANGULAR TO POLAR CONVERSION 19.5 19.5 120 120
E-112 RECT POLAR TO RECTANGULAR CONVERSION 2.3 2.3 49 49
E-113 CVEXP COMPLEX VECTOR EXPONENTIAL 2.0 2.0 43 43
E-114 CVMEXP VECTOR MULTIPLY COMPLEX EXPONENTIAL 2.3 2.3 48 48
E-115 CDOTPR COMPLEX DOT PRODUCT 0.7 1.3 15 16
DATA FORMATTING OPERATIONS
E-117 VFLT VECTOR INTEGER FLOAT 0.5 0.8 13 11
E-118 VFIX VECTOR INTEGER FIX 0.7 0.8 18 7
E-120 VSMAFX VECTOR SCALAR MULTIPLY, ADD, AND FIX 0.7 0.8 14 13
E-121 VSCALE VECTOR SCALE (POWER 2) AND FIX 0.7 0.8 12 12
E-123 VSCSCL VECTOR SCAN, SCALE (POWER 2) AND FIX 1.5 1.7 19 19
E-125 VSHFX VECTOR SHIFT AND FIX 0.7 0.8 9 9
E-126 VUP8 VECTOR 8-BIT BYTE UNPACK 0.5 0.5 71 71
E-127 VUPS8 VECTOR 8-BIT SIGNED BYTE UNPACK 0.9 0.9 107 107
E-128 VPK8 VECTOR 8-BIT BYTE PACK 0.9 0.9 65 65
E-129 VUP16 VECTOR 16-BIT BYTE UNPACK 0.8 0.8 61 61
E-130 VUPS16 VECTOR 16-BIT SIGNED BYTE UNPACK 1.3 1.3 58 58
E-131 VPK16 VECTOR 16-BIT BYTE PACK 0.8 0.8 46 46
E-132 VFLT32 VECTOR 32-BIT INTEGER FLOAT 1.7 1.7 65 65
E-133 VFIX32 VECTOR 32-BIT INTEGER FIX 1.2 1.2 33 33
E-134 VSEFLT VECTOR SIGN EXTEND AND FLOAT 0.8 0.8 15 15
MATRIX OPERATIONS
E-136 MTRANS MATRIX TRANSPOSE 0.5 0.9 18 22
E-137 MMUL MATRIX MULTIPLY 0.62* 0.83 59 59
E-139 MMUL32 MATRIX MULTIPLY (DIMENSION <=32) 0.50* 0.73 27 27
E-141 MATINV MATRIX INVERSE 1.6 * 2.1 160 160
E-143 SOLVEQ LINEAR EQUATION SOLVER 0.7 * 0.9 216 222
E-145 MVML3 MATRIX VECTOR MULTIPLY (3X3) 2.0 * 2.2 30 30
E-147 MVML4 MATRIX VECTOR MULTIPLY (4X4) 3.3 * 3.8 39 39
E-149 CTRN3 3-DIMENSION COORDINATE TRANSFORMATION 2.3 * 2.5 / a]
E-151 FMMM FAST MEMORY MATRIX MULTIPLY 0.43* 61
E-153 FMMM32 FAST MEMORY MATRIX MULTIPLY (<=32) 0.41* 35
FFT OPERATIONS
E-156 CFFT COMPLEX TO COMPLEX FFT (IN PLACE) 0.28* 0.40 186 184
E-158 CFFTB COMPLEX TO COMPLEX FFT (NOT IN PLACE) 0.20* 0.28 189 189
E-160 RFFT REAL TO COMPLEX FFT (IN PLACE) 0.18* 0.27 253 251
E-162 RFFTB REAL TO COMPLEX FFT (NOT IN PLACE) 0.14* 0.20 252 252
E-164 CFFTSC COMPLEX FFT SCALE 0.8 1.3 42 42
E-165 RFFTSC REAL FFT SCALE AND FORMAT 0.7 0.8 59 59
E-167 CFFT2D COMPLEX TO COMPLEX 2-DIMENSIONAL FFT 0.5 * 0.5 274 274
E-169 RFFT2D REAL TO COMPLEX 2-DIMENSIONAL FFT 0.4 * 0.4 285 385
AUXILIARY OPERATIONS
E-172 CONV CONVOLUTION (CORRELATION) 0.28* 0.28 106 106
E-174 DEQ22 DIFFERENCE EQUATION, 2 POLES, 2 ZEROS 0.8 0.8 ve 22
E-175 VPOLY VECTOR POLYNOMIAL EVALUATION 1.0 * 1.2 41 41
E-177 VSUM VECTOR SUM OF ELEMENTS INTEGRATION 0.7 0.8 13 i) 
E-178 VTRAPZ VECTOR TRAPEZOIDAL RULE INTEGRATION 0.7 0.8 16 16 
E-179 VSIMPS VECTOR SIMPSONS 1/3 RULE INTEGRATION 0.7 0.8 25 25 
E-180 WIENER WIENER LEVINSON ALGORITHM 0.50* 0.65 100 100 
SIGNAL PROCESSING OPERATIONS (optional) 
E-183 HIST HISTOGRAM 1.3 1.4 71 71 
E-184 HANN HANNING WINDOW MULTIPLY 0.7 0.8 41 41
E-186 ASPEC ACCUMULATING AUTO-SPECTRUM 0.8 1.5 21 22
E-187 CSPEC ACCUMULATING CROSS-SPECTRUM 1.3 2.7 39 40
E-188 VAVLIN VECTOR LINEAR AVERAGING 0.8 1.3 54 46
E-189 VAVEXP VECTOR EXPONENTIAL AVERAGING 0.8 1.3 55 46
E-190 VDBPWR VECTOR CONVERSION TO DB (POWER) 1.2 1.3 75 75
E-191 TRANS TRANSFER FUNCTION 3.3 3.3 100 100
E-192 COHER COHERENCE FUNCTION 4.0 4.5 109 114
E-193 ACORT AUTO-CORRELATION (TIME-DOMAIN) 0.29* 0.29 i a aes!
E-195 ACORF AUTO-CORRELATION (FREQUENCY-DOMAIN) 1.80* 2.70 501 489
E-197 CCORT CROSS-CORRELATION (TIME-DOMAIN) 0.29* 0.29 121 121
E-199 CCORF CROSS-CORRELATION (FREQUENCY-DOMAIN) 2.58* 3.93 526 510
E-201 TCONV POSTTAPERED CONVOLUTION (CORRELATION) 0.30* 0.30 Li2). 212
TABLE MEMORY OPERATIONS (optional)
E-204 MTMOV VECTOR MOVE (MD TO TM) 0.2 0.3 6 7
E-205 TMMOV VECTOR MOVE (TM TO MD) 0.2 0.3 5 5
E-206 MTIMOV VECTOR MOVE WITH INCREMENT (MD TO TM) 0.5 0.5 7 7
E-207 TMIMOV VECTOR MOVE WITH INCREMENT (TM TO MD) 0.3 0.3 15 15
E-208 TTIMOV VECTOR MOVE WITH INCREMENT (TM TO TM) 0.5 0.5 7 7
E-209 MMTADD VECTOR ADD (MD+MD TO TM) 0.7 0.8 20 13
E-210 MMTSUB VECTOR SUBTRACT (MD-MD TO TM) 0.7 0.8 20 13
E-211 MMTMUL VECTOR MULTIPLY (MD*MD TO TM) 0.7 0.8 20 13
E-212 MTMADD VECTOR ADD (MD+TM TO MD) 0.5 0.8 20 9
E-213 MTMSUB VECTOR SUBTRACT (MD-TM TO MD) 0.5 0.8 20 9
E-214 TMMSUB VECTOR SUBTRACT (TM-MD TO MD) 0.5 0.8 20 9
E-215 MTMMUL VECTOR MULTIPLY (MD*TM TO MD) 0.5 0.8 20 9
E-216 MTTADD VECTOR ADD (MD+TM TO TM) 0.5 0.5 20 20
E-217 MTTSUB VECTOR SUBTRACT (MD-TM TO TM) 0.5 0.5 20 20
E-218 TMTSUB VECTOR SUBTRACT (TM-MD TO TM) 0.5 0.5 20 20
E-219 MTTMUL VECTOR MULTIPLY (MD*TM TO TM) 0.5 0.5 20 20
E-220 TTMADD VECTOR ADD (TM+TM TO MD) 0.5 0.5 20 20
E-221 TTMSUB VECTOR SUBTRACT (TM-TM TO MD) 0.5 0.5 20 20
E-222 TTMMUL VECTOR MULTIPLY (TM*TM TO MD) 0.5 0.5 20 20
E-223 TTTADD VECTOR ADD (TM+TM TO TM) 0.7 0.7 9 9
E-224 TTTSUB VECTOR SUBTRACT (TM-TM TO TM) 0.7 0.7 9 9
E-225 TTTMUL VECTOR MULTIPLY (TM*TM TO TM) 0.7 0.7 10 10
APAL-CALLABLE UTILITY OPERATIONS

E-227 DIV SCALAR DIVIDE 3.8 @ 3.8 28 28
E-228 SQRT SCALAR SQUARE ROOT 3.8 @ 3.8 28 28
E-229 LOG SCALAR LOGARITHM (BASE 10) 4.7 @ 4.7 37 37
E-230 LN SCALAR NATURAL LOGARITHM 4.0 @ 4.0 37 37
E-231 EXP SCALAR EXPONENTIAL 4.2 @ 4.2 28 28
E-232 SIN SCALAR SINE 4.9 @ 4.9 35 35
E-233 COS SCALAR COSINE 5.4 @ 5.4 35 35
E-234 ATAN SCALAR ARCTANGENT 8.7 @ 8.7 74 74
E-235 ATN2 SCALAR ARCTANGENT OF Y/X 13.8 @13.8 74 74
E-236 SPFLT FLOAT S-PAD INTEGER 0.8 @ 0.8 5 5
E-237 SPUFLT S-PAD UNSIGNED FLOAT 0.8 @ 0.8 8 8
E-238 SPNEG S-PAD NEGATE 0.3 @ 0.3 2 2
E-239 SPADD S-PAD ADD 0.2 @ 0.2 1 1
E-240 SPSUB S-PAD SUBTRACT 0.2 @ 0.2 1 1
E-241 SPMUL S-PAD MULTIPLY 2.3 @ 2.3 14 14
E-242 SPDIV S-PAD DIVIDE 6.2 @ 6.2 43 43
E-243 SPRS S-PAD RIGHT SHIFT 0.3 * 0.3 5 5
E-244 SPLS S-PAD LEFT SHIFT 0.3 * 0.3 5 5
E-245 SPAND S-PAD AND 0.2 @ 0.2 1 1
E-246 SPOR S-PAD OR 0.2 @ 0.2 1 1
E-247 SPNOT S-PAD NOT 0.2 @ 0.2 1 1
E-248 SAVESP SAVE S-PAD INTO PROGRAM MEMORY 0.8 * 0.8 18 18
E-249 SAVSPO SAVE S-PAD 0 INTO PROGRAM MEMORY 2.0 * 2.0 11 11
E-250 SETSP LOAD S-PADS FROM PROGRAM MEMORY 2.3 * 2.3 33 33
E-252 SET2SP LOAD 2 S-PADS FROM PROGRAM MEMORY 5.7 @ 5.7 33 33
E-253 MDCOM MAIN DATA COMPARE AND SET S-PAD 1.8 @ 2.0 11 11
E-254 ZMD CLEAR ALL PAGES OF MAIN DATA MEMORY 0.2 0.3 29 29
E-255 RDC5 READ CONTROL BIT 5 INTERRUPT 1.5 @ 1.5 9 9
E-256 SETC5 SET CONTROL BIT 5 INTERRUPT 0.2 @ 0.2 1 1
E-257 DAREAD READ DEVICE ADDRESS REGISTER 0.3 @ 0.3 2 2
E-258 DAWRIT WRITE DEVICE ADDRESS REGISTER 0.3 @ 0.3 2 2
E-259 VFCL1 VECTOR FUNCTION CALLER (1 ARGUMENT) 0.8 1.0 10 10
E-260 VFCL2 VECTOR FUNCTION CALLER (2 ARGUMENT) 1.0 1.0 11 11
E-261 BITREV COMPLEX VECTOR BIT REVERSE ORDERING 0.9 1.4 45 43
E-262 REALTR REAL FFT UNRAVEL AND FINAL PASS 0.4 0.7 68 68
E-263 FFT2 RADIX 2 FFT FIRST PASS 1.3 2.7 16 16
E-264 FFT4 RADIX 4 FFT PASS 3.7 5.3 79 79
E-265 FFT2B RADIX 2 FFT FIRST PASS + BIT REVERSE 1.3 2.7 25 25
E-266 FFT4B RADIX 4 FFT FIRST PASS + BIT REVERSE 2.7 5.3 43 43
E-267 STSTAT SET FFT MODE STATUS BITS 5.0 @ 5.0 19 19
E-268 CLSTAT CLEAR FFT MODE STATUS BITS 0.5 @ 0.5 19 19
E-269 ILOG2 LOGARITHM (BASE 2) 4.0 @ 4.0 19 19
E-270 ADV2 ADVANCE POINTERS AFTER RADIX 2 FFT 0.7 @ 0.7 7 7
E-271 ADV4 ADVANCE POINTERS AFTER RADIX 4 FFT 0.7 @ 0.7 7 7
E-272 SET24B SETUP FOR FFT2B AND FFT4B 1.2 @ 1.2 8 8
E-273 XCFFT EXPANDED COMPLEX FFT 0.32* 0.42 187 187
E-275 XRFFT EXPANDED REAL FFT 0.19* 0.28 256 256
E-277 XBITRE EXPANDED BIT REVERSE 3.7 3.7 44 44
E-278 XREALT EXPANDED REAL FFT FINAL PASS 0.4 0.7 71 71
E-279 PCFFT PARTIAL COMPLEX FFT 1.05* 1.50 117 117
E-280 XFFT4 EXPANDED RADIX 4 FFT PASS 3.7 5.3 79 79
E-281 CTOR COMPLEX TO REAL FFT UNSCRAMBLE 0.13* 0.13 80 80
E-282 RTOC REAL TO COMPLEX FFT SCRAMBLE 0.09* 0.09 143 143
E-284 SSDA SINGLE + SINGLE TO DOUBLE ADD 1.5 @ 1.5 10 10
E-285 SSDM SINGLE * SINGLE TO DOUBLE MULTIPLY 11.5 @11.5 81 81
E-286 SDDA SINGLE + DOUBLE TO DOUBLE ADD 4.5 @ 4.5 28 28
E-287 DDDA DOUBLE + DOUBLE TO DOUBLE ADD 7.5 @ 7.5 48 48
E-288 DDDM DOUBLE * DOUBLE TO DOUBLE MULTIPLY 18.5 @18.5 117 117


Notes: #.# Timing host system dependent
* Refer to description of routine for explanation of timing
@ Total execution time
APPENDIX C ABBREVIATED DESCRIPTIONS OF ROUTINES


Page Routine Purpose


DATA TRANSFER AND CONTROL OPERATIONS (APEX)


E-4 CALL APPUT(HOST,AP,N,TYPE) PUT DATA INTO THE AP
To transfer data from the host
computer memory into the AP
main data memory.

HOST = An array name, array element,
       variable, or constant which
       specifies the initial host data
       element to be transferred.
AP   = An integer constant, variable, or
       expression which specifies the base
       address in AP main data memory
       into which data is to be transferred.
N    = Element count (AP data words)
TYPE = An integer specifying the host data
       type and format conversion during
       data transfer to the AP.

       0 32-bit integers. Stored without
         format conversion into the low
         32 bits (bits 8-39) of AP
         main data memory words.

       1 16-bit integers. Converted into
         unnormalized AP floating-point
         numbers. These numbers must be
         normalized (using VFLT) before they
         can be processed by the AP.

       2 Host single-precision (real)
         floating-point numbers. Converted
         "on the fly" to normalized AP
         floating-point numbers.

       3 IBM 360 32-bit format floating-point
         numbers. Converted "on the fly"
         to normalized AP floating-point
         numbers.


E-6 CALL APGET(HOST,AP,N,TYPE) GET DATA FROM THE AP
To transfer data from the AP
main data memory into the host
computer memory.

HOST = An array name, array element,
       variable, or constant which
       specifies the initial host memory
       location to receive transferred data.
AP   = An integer constant, variable, or
       expression which specifies the base
       address in AP main data memory
       from which data is to be transferred.
N    = Element count (AP data words)
TYPE = An integer specifying the host data
       type and format conversion during
       data transfer from the AP.

       0 32-bit integers. The low 32 bits
         (bits 8-39) of AP memory words
         are transferred without format
         conversion into the host memory.

       1 16-bit integers. The low 16 bits
         (bits 24-39) of AP memory words
         are stored into host integer
         locations.

       2 AP floating-point numbers are
         converted "on the fly" into host
         single-precision (real)
         floating-point numbers.

       3 AP floating-point numbers are
         converted "on the fly" into IBM 360
         32-bit format floating-point numbers
         in the host.

E-8 CALL APCLR INITIALIZE THE AP
To initialize the AP by
clearing the hardware status
and initializing APEX.
E-9 CALL APWD WAIT FOR AP DATA TRANSFER
To delay host program execution
until any previously initiated
data transfer between the host
and the AP has been completed.
E-10 CALL APWR WAIT FOR AP PROGRAM EXECUTION
To delay host program execution
until any previously initiated
AP program has been completed.
E-11 CALL APWAIT WAIT FOR AP
To delay host program execution
until the AP is done
transferring data and executing
a program.


E-12 CALL APGSP(I,NREG) READ AN AP S-PAD REGISTER
To read the contents of an AP S-Pad register.
I = Value contained in S-Pad register
NREG = S-Pad register number (1 to 15)

E-13 CALL APCHK(IERR) CHECK AP PROGRAM ERROR CONDITION
To check error information returned by certain AP Math
Library programs.
IERR = Error information from AP
program.


Page Routine Purpose

E-14 CALL APSTAT(IERR,ISTAT) GET AP HARDWARE STATUS
To read the AP status and DMA control registers.
IERR = Set to 1 if hardware error
detected, 0 otherwise
ISTAT = A 4-element array to delineate
the error conditions as follows:
ISTAT(1) = Arithmetic overflow
ISTAT(2) = Arithmetic underflow
ISTAT(3) = Divide by zero
ISTAT(4) = Format conversion
overflow/underflow


E-16 CALL VCLR(C,K,N) VECTOR CLEAR
To clear elements of a vector.
C = Destination vector base address
K = C address increment
N = Element count

E-17 CALL VMOV(A,I,C,K,N) VECTOR MOVE
To move elements of a vector from one location to another.
A = Source vector base address
I = A address increment
C = Destination vector base address
K = C address increment
N = Element count

E-18 CALL VSWAP(A,I,C,K,N) VECTOR SWAP
To swap data between two vectors.
A = Vector base address
I = A address increment
C = Vector base address
K = C address increment
N = Element count

E-19 CALL VFILL(A,C,K,N) VECTOR FILL
To fill elements of a vector with a constant.
A = Address of constant value
C = Destination vector base address
K = C address increment
N = Element count

E-20 CALL VRAMP(A,B,C,K,N) VECTOR RAMP
To fill elements of a vector with a ramp function.
A = Address of initial ramp value
B = Address of ramp increment
C = Destination vector base address
K = C address increment
N = Element count

E-21 CALL VNEG(A,I,C,K,N) VECTOR NEGATE
To negate elements of a vector.
A = Source vector base address
I = A address increment
C = Destination vector base address
K = C address increment
N = Element count

CALL VADD(A,I,B,J,C,K,N) VECTOR ADD
To add the elements of two vectors.
A = Source vector base address
I = A address increment
B = Source vector base address
J = B address increment
C = Destination vector base address
K = C address increment
N = Element count

CALL VSUB(A,I,B,J,C,K,N) VECTOR SUBTRACT
To subtract the elements of two vectors.
A = Source vector base address
I = A address increment
B = Source vector base address
J = B address increment
C = Destination vector base address
K = C address increment
N = Element count

CALL VMUL(A,I,B,J,C,K,N) VECTOR MULTIPLY
To multiply the elements of two vectors.
A = Source vector base address
I = A address increment
B = Source vector base address
J = B address increment
C = Destination vector base address
K = C address increment
N = Element count

CALL VDIV(A,I,B,J,C,K,N) VECTOR DIVIDE
To divide the elements of two vectors.
A = Source vector base address
I = A address increment
B = Source vector base address
J = B address increment
C = Destination vector base address
K = C address increment
N = Element count

CALL VSADD(A,I,B,C,K,N) VECTOR SCALAR ADD
To add a scalar to the elements of a vector.
A = Source vector base address
I = A address increment
B = Scalar address
C = Destination vector base address
K = C address increment
N = Element count

CALL VSMUL(A,I,B,C,K,N) VECTOR SCALAR MULTIPLY
To multiply the elements of a vector by a scalar.
A = Source vector base address
I = A address increment
B = Scalar address
C = Destination vector base address
K = C address increment
N = Element count
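
The base address, address increment, and element count arguments shared by the vector routines above can be illustrated with a small host-side model. The Python sketch below is an assumption-laden illustration, not the AP implementation: it treats AP main data memory as a flat Python list named `mem` (a hypothetical stand-in) and mimics the addressing of VADD, where element m of each vector lives at base + m * increment.

```python
def vadd(mem, a, i, b, j, c, k, n):
    """Model of CALL VADD(A,I,B,J,C,K,N) on a flat list `mem`.

    For m = 0 .. n-1: mem[c + m*k] = mem[a + m*i] + mem[b + m*j].
    """
    for m in range(n):
        mem[c + m * k] = mem[a + m * i] + mem[b + m * j]
    return mem


# Two 4-element vectors stored contiguously at addresses 0 and 4;
# the sums are written to every second word starting at address 8.
mem = [1.0, 2.0, 3.0, 4.0, 10.0, 20.0, 30.0, 40.0] + [0.0] * 8
vadd(mem, a=0, i=1, b=4, j=1, c=8, k=2, n=4)
result = mem[8:16:2]
```

With an increment of 2 on the destination, the words between the results are left untouched, which is how interleaved data can share one region of memory.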


CALL VTSADD(A,I,B,C,K,N) VECTOR TABLE SCALAR ADD
To add a table memory scalar to the elements of a vector.
A = Source vector base address
I = A address increment
B = Scalar address (Table Memory)
C = Destination vector base address
K = C address increment
N = Element count

CALL VTSMUL(A,I,B,C,K,N) VECTOR TABLE SCALAR MULTIPLY
To multiply the elements of a vector by a table memory scalar.
A = Source vector base address
I = A address increment
B = Scalar address (Table Memory)
C = Destination vector base address
K = C address increment
N = Element count

CALL VSQ(A,I,C,K,N) VECTOR SQUARE
To square the elements of a vector.
A = Source vector base address
I = A address increment
C = Destination vector base address
K = C address increment
N = Element count

CALL VSSQ(A,I,C,K,N) VECTOR SIGNED SQUARE
To multiply each element of a vector by the absolute value of
that element.
A = Source vector base address
I = A address increment
C = Destination vector base address
K = C address increment
N = Element count

CALL VABS(A,I,C,K,N) VECTOR ABSOLUTE VALUE
To take the absolute value of the elements of a vector.
A = Source vector base address
I = A address increment
C = Destination vector base address
K = C address increment
N = Element count

CALL VSQRT(A,I,C,K,N) VECTOR SQUARE ROOT
To take the square root of the elements of a vector.
A = Source vector base address
I = A address increment
C = Destination vector base address
K = C address increment
N = Element count

CALL VLOG(A,I,C,K,N) VECTOR LOGARITHM (BASE 10)
To take the logarithm (base 10) of the elements of a vector.
A = Source vector base address
I = A address increment
C = Destination vector base address
K = C address increment
N = Element count

CALL VLN(A,I,C,K,N) VECTOR NATURAL LOGARITHM
To take the natural logarithm of the elements of a vector.
A = Source vector base address
I = A address increment
C = Destination vector base address
K = C address increment
N = Element count

CALL VALOG(A,I,C,K,N) VECTOR ANTILOGARITHM (BASE 10)
To take the antilogarithm of the elements of a vector.
A = Source vector base address
I = A address increment
C = Destination vector base address
K = C address increment
N = Element count

CALL VEXP(A,I,C,K,N) VECTOR EXPONENTIAL
To take the exponential of the elements of a vector.
A = Source vector base address
I = A address increment
C = Destination vector base address
K = C address increment
N = Element count

CALL VSIN(A,I,C,K,N) VECTOR SINE
To compute the sine of the elements of a vector.
A = Source vector base address
I = A address increment
C = Destination vector base address
K = C address increment
N = Element count

E-39 CALL VCOS(A,I,C,K,N) VECTOR COSINE
To compute the cosine of the elements of a vector.
A = Source vector base address
I = A address increment
C = Destination vector base address
K = C address increment
N = Element count

E-40 CALL VATAN(A,I,C,K,N) VECTOR ARCTANGENT
To take the arctangent of the elements of a vector.
A = Source vector base address
I = A address increment
C = Destination vector base address
K = C address increment
N = Element count

E-41 CALL VATN2(A,I,B,J,C,K,N) VECTOR ARCTANGENT OF Y/X
To take the arctangent of the ratio of the elements of two
vectors.
A = Source vector base address
I = A address increment
B = Source vector base address
J = B address increment
C = Destination vector base address
K = C address increment
N = Element count

E-42 CALL VRAND(A,C,K,N) VECTOR RANDOM NUMBERS
To fill elements of a vector with random numbers.
A = Address of random number seed
C = Destination vector base address
K = C address increment
N = Element count
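
Of the routines above, VSSQ is the one whose name gives the least away: the signed square multiplies each element by its own absolute value, so the result keeps the sign of the input, whereas VSQ always yields a non-negative value. The Python sketch below contrasts the two on plain lists. It is a host-side illustration only; the function names `vsq` and `vssq` are hypothetical stand-ins, and address increments are omitted for brevity.

```python
def vsq(values):
    """Element-wise square, as described for VSQ."""
    return [x * x for x in values]


def vssq(values):
    """Element-wise signed square, as described for VSSQ: x * |x|."""
    return [x * abs(x) for x in values]


data = [-3.0, 2.0, 0.5]
squares = vsq(data)          # [9.0, 4.0, 0.25]
signed_squares = vssq(data)  # [-9.0, 4.0, 0.25]
```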


CALL VMSA(A,I,B,J,C,D,L,N) VECTOR MULTIPLY AND SCALAR ADD
To multiply the elements of two vectors and add a scalar to the
products.
A = Source vector base address
I = A address increment
B = Source vector base address
J = B address increment
C = Scalar address
D = Result vector base address
L = D address increment
N = Vector length

CALL VSMA(A,I,B,C,K,D,L,N) VECTOR SCALAR MULTIPLY AND ADD
To multiply the elements of a vector by a scalar and add a
second vector to the products.
A = Source vector base address
I = A address increment
B = Scalar address
C = Source vector base address
K = C address increment
D = Destination vector base address
L = D address increment
N = Element count

CALL VSMSB(A,I,B,C,K,D,L,N) VECTOR SCALAR MULTIPLY AND SUBTRACT
To multiply the elements of a vector by a scalar and subtract
a second vector from the products.
A = Source vector base address
I = A address increment
B = Scalar address
C = Source vector base address
K = C address increment
D = Destination vector base address
L = D address increment
N = Element count

CALL VMA(A,I,B,J,C,K,D,L,N) VECTOR MULTIPLY AND ADD
To multiply the elements of two vectors, and add the products
to a third vector, i.e., D=(A*B)+C.
A = Source vector base address
I = A address increment
B = Source vector base address
J = B address increment
C = Source vector base address
K = C address increment
D = Destination vector base address
L = D address increment
N = Element count

CALL VMSB(A,I,B,J,C,K,D,L,N) VECTOR MULTIPLY AND SUBTRACT
To multiply the elements of two vectors, and subtract a third
vector from the products, i.e., D=(A*B)-C.
A = Source vector base address
I = A address increment
B = Source vector base address
J = B address increment
C = Source vector base address
K = C address increment
D = Destination vector base address
L = D address increment
N = Element count

CALL VAM(A,I,B,J,C,K,D,L,N) VECTOR ADD AND MULTIPLY
To add the elements of two vectors, and multiply the sum
by a third vector, i.e., D=(A+B)*C.
A = Source vector base address
I = A address increment
B = Source vector base address
J = B address increment
C = Source vector base address
K = C address increment
D = Destination vector base address
L = D address increment
N = Element count

CALL VSBM(A,I,B,J,C,K,D,L,N) VECTOR SUBTRACT AND MULTIPLY
To subtract the elements of two vectors, and multiply the
difference by a third vector, i.e., D=(A-B)*C.
A = Source vector base address
I = A address increment
B = Source vector base address
J = B address increment
C = Source vector base address
K = C address increment
D = Destination vector base address
L = D address increment
N = Element count

CALL VSMSA(A,I,B,C,D,L,N) VECTOR SCALAR MULTIPLY AND SCALAR ADD
To multiply the elements of a vector by a scalar and add a
second scalar to the products.
A = Source vector base address
I = A address increment
B = Multiplying scalar address
C = Adding scalar address
D = Destination vector base address
L = D address increment
N = Element count

CALL VMMA(A,I,B,J,C,K,D,L,E,M,N) VECTOR MULTIPLY, MULTIPLY, AND ADD
To multiply the elements of two vectors, multiply the elements
of a second set of two vectors, and add the two product
vectors, i.e., E=(A*B)+(C*D).
A = Source vector base address
I = A address increment
B = Source vector base address
J = B address increment
C = Source vector base address
K = C address increment
D = Source vector base address
L = D address increment
E = Destination vector base address
M = E address increment
N = Element count

CALL VMMSB(A,I,B,J,C,K,D,L,E,M,N) VECTOR MULTIPLY MULTIPLY AND SUBTRACT
To multiply the elements of two vectors, multiply the elements
of a second set of two vectors, and subtract the two product
vectors, i.e., E=(A*B)-(C*D).
A = Source vector base address
I = A address increment
B = Source vector base address
J = B address increment
C = Source vector base address
K = C address increment
D = Source vector base address
L = D address increment
E = Destination vector base address
M = E address increment
N = Element count
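
The compound routines above differ mainly in where the parentheses fall. The Python sketch below evaluates three of the stated formulas, D=(A*B)+C for VMA, D=(A+B)*C for VAM, and E=(A*B)+(C*D) for VMMA, on small lists. It is a host-side illustration only: the lowercase function names are hypothetical, and the address increment arguments are left out so that the arithmetic stays in view.

```python
def vma(a, b, c):
    """D = (A*B) + C, element by element."""
    return [x * y + z for x, y, z in zip(a, b, c)]


def vam(a, b, c):
    """D = (A+B) * C, element by element."""
    return [(x + y) * z for x, y, z in zip(a, b, c)]


def vmma(a, b, c, d):
    """E = (A*B) + (C*D), element by element."""
    return [w * x + y * z for w, x, y, z in zip(a, b, c, d)]


a, b, c, d = [1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]
d_vma = vma(a, b, c)       # [8.0, 14.0]
d_vam = vam(a, b, c)       # [20.0, 36.0]
e_vmma = vmma(a, b, c, d)  # [38.0, 56.0]
```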


CALL VAAM(A,I,B,J,C,K,D,L,E,M,N) VECTOR ADD, ADD, AND MULTIPLY
To add the elements of two vectors, add the elements of a
second set of two vectors, and multiply the two sum vectors,
i.e., E=(A+B)*(C+D).
A = Source vector base address
I = A address increment
B = Source vector base address
J = B address increment
C = Source vector base address
K = C address increment
D = Source vector base address
L = D address increment
E = Destination vector base address
M = E address increment
N = Element count

CALL VSBSBM(A,I,B,J,C,K,D,L,E,M,N) VECTOR SUBTRACT SUBTRACT AND MULTIPLY
To subtract the elements of two vectors, subtract the elements
of a second set of two vectors, and multiply the two difference
vectors, i.e., E=(A-B)*(C-D).
A = Source vector base address
I = A address increment
B = Source vector base address
J = B address increment
C = Source vector base address
K = C address increment
D = Source vector base address
L = D address increment
E = Destination vector base address
M = E address increment
N = Element count

CALL VAND(A,I,B,J,C,K,N) VECTOR LOGICAL AND
To logically AND the elements of two vectors.
A = Source vector base address
I = A address increment
B = Source vector base address
J = B address increment
C = Destination vector base address
K = C address increment
N = Element count

CALL VEQV(A,I,B,J,C,K,N) VECTOR LOGICAL EQUIVALENCE
To logically EQUIVALENCE the elements of two vectors.
A = Source vector base address
I = A address increment
B = Source vector base address
J = B address increment
C = Destination vector base address
K = C address increment
N = Element count

CALL VOR(A,I,B,J,C,K,N) VECTOR LOGICAL OR
To logically OR the elements of two vectors.
A = Source vector base address
I = A address increment
B = Source vector base address
J = B address increment
C = Destination vector base address
K = C address increment
N = Element count

FPS 860-7288-004 C

CALL VFRAC(A,I,C,K,N) VECTOR TRUNCATE TO FRACTION
To truncate the elements of a vector to their fractional parts.
A = Source vector base address
I = A address increment
C = Destination vector base address
K = C address increment
N = Element count

CALL VINT(A,I,C,K,N) VECTOR TRUNCATE TO INTEGER
To truncate the elements of a vector to integer floating
point numbers.
A = Source vector base address
I = A address increment
C = Destination vector base address
K = C address increment
N = Element count

E-60 CALL VINDEX(A,B,J,C,K,N) VECTOR INDEX
To form a vector by using the elements of one vector as the
addresses by which to select the elements of a second vector.
A = Source vector base address
B = Index vector base address
J = B address increment
C = Destination vector base address
K = C address increment
N = Element count

VECTOR-TO-SCALAR OPERATIONS

E-62 CALL SVE(A,I,C,N) SUM OF VECTOR ELEMENTS
To sum the elements of a vector.
A = Source vector base address
I = A address increment
C = Destination scalar address
N = Element count

CALL SVEMG(A,I,C,N) SUM OF VECTOR ELEMENT MAGNITUDES
To sum the absolute values of the elements of a vector.
A = Source vector base address
I = A address increment
C = Destination scalar address
N = Element count

E-64 CALL SVESQ(A,I,C,N) SUM OF VECTOR ELEMENT SQUARES
To sum the squares of the elements of a vector.
A = Source vector base address
I = A address increment
C = Destination scalar address
N = Element count

E-65 CALL SVS(A,I,C,N) SUM OF VECTOR SIGNED SQUARES
To sum the signed squares of the elements of a vector.
A = Source vector base address
I = A address increment
C = Destination scalar address
N = Element count

E-66 CALL DOTPR(A,I,B,J,C,N) DOT PRODUCT
To compute the dot product of the elements of two vectors.
A = Source vector base address
I = A address increment
B = Source vector base address
J = B address increment
C = Destination scalar address
N = Element count

CALL MAXV(A,I,C,N) MAXIMUM ELEMENT IN VECTOR
To scan a vector for its maximum element.
A = Source vector base address
I = A address increment
C = Destination scalar address
(2 words required)
N = Element count

CALL MINV(A,I,C,N) MINIMUM ELEMENT IN VECTOR
To scan a vector for its minimum element.
A = Source vector base address
I = A address increment
C = Destination scalar address
(2 words required)
N = Element count

CALL MAXMGV(A,I,C,N) MAXIMUM MAGNITUDE ELEMENT IN VECTOR
To scan a vector for its maximum magnitude (absolute
value) element.
A = Source vector base address
I = A address increment
C = Destination scalar address
(2 words required)
N = Element count

CALL MINMGV(A,I,C,N) MINIMUM MAGNITUDE ELEMENT IN VECTOR
To scan a vector for its minimum magnitude (absolute
value) element.
A = Source vector base address
I = A address increment
C = Destination scalar address
(2 words required)
N = Element count

CALL MEANV(A,I,C,N) MEAN VALUE OF VECTOR ELEMENTS
To compute the mean (average) value of the elements of a
vector.
A = Source vector base address
I = A address increment
C = Destination scalar address
N = Element count

CALL MEAMGV(A,I,C,N) MEAN OF VECTOR ELEMENT MAGNITUDES
To compute the mean (average) value of the absolute values of
the elements of a vector.
A = Source vector base address
I = A address increment
C = Destination scalar address
N = Element count

CALL MEASQV(A,I,C,N) MEAN OF VECTOR ELEMENT SQUARES
To compute the mean (average) value of the squares of the
elements of a vector.
A = Source vector base address
I = A address increment
C = Destination scalar address
N = Element count

CALL RMSQV(A,I,C,N) ROOT-MEAN-SQUARE OF VECTOR ELEMENTS
To compute the square root of the mean (average) value of the
squares of the elements of a vector.
A = Source vector base address
I = A address increment
C = Destination scalar address
N = Element count

CALL VMAX(A,I,B,J,C,K,N) VECTOR MAXIMUM
To form a vector from the maximum value of each
corresponding pair of elements of two vectors.
A = Source vector base address
I = A address increment
B = Source vector base address
J = B address increment
C = Destination vector base address
K = C address increment
N = Element count

CALL VMIN(A,I,B,J,C,K,N) VECTOR MINIMUM
To form a vector from the minimum value of each
corresponding pair of elements of two vectors.
A = Source vector base address
I = A address increment
B = Source vector base address
J = B address increment
C = Destination vector base address
K = C address increment
N = Element count

CALL VMAXMG(A,I,B,J,C,K,N) VECTOR MAXIMUM MAGNITUDE
To form a vector from the maximum absolute value of each
corresponding pair of elements of two vectors.
A = Source vector base address
I = A address increment
B = Source vector base address
J = B address increment
C = Destination vector base address
K = C address increment
N = Element count

CALL VMINMG(A,I,B,J,C,K,N) VECTOR MINIMUM MAGNITUDE
To form a vector from the minimum absolute value of each
corresponding pair of elements of two vectors.
A = Source vector base address
I = A address increment
B = Source vector base address
J = B address increment
C = Destination vector base address
K = C address increment
N = Element count

CALL VCLIP(A,I,B,C,D,L,N) VECTOR CLIP
To clip the values of a vector to within a specified range.
A = Source vector base address
I = A address increment
B = Address of smaller scalar
C = Address of larger scalar
D = Destination vector base address
L = D address increment
N = Element count

CALL VICLIP(A,I,B,C,D,L,N) VECTOR INVERTED CLIP
To exclude values of a vector from within a specified range.
A = Source vector base address
I = A address increment
B = Address of smaller scalar
C = Address of larger scalar
D = Destination vector base address
L = D address increment
N = Element count

CALL VLIM(A,I,B,C,D,L,N) VECTOR LIMIT
To create a vector limited to a single value in magnitude,
where the sign of each element depends on whether the
corresponding element of a second vector exceeds a certain
value.
A = Source vector base address
I = A address increment
B = Address of scalar to compare
with source
C = Address of destination magnitude
scalar
D = Destination vector base address
L = D address increment
N = Element count

CALL LVGT(A,I,B,J,C,K,N) LOGICAL VECTOR GREATER THAN
To compare the elements of two vectors A and B and output a
vector C such that:
C(mK)=1.0 if A(mI)>B(mJ)
C(mK)=0.0 if A(mI)<=B(mJ)
A = Source vector base address
I = A address increment
B = Source vector base address
J = B address increment
C = Destination vector base address
K = C address increment
N = Element count

CALL LVGE(A,I,B,J,C,K,N) LOGICAL VECTOR GREATER THAN OR EQUAL
To compare the elements of two vectors A and B and output a
vector C such that:
C(mK)=1.0 if A(mI)>=B(mJ)
C(mK)=0.0 if A(mI)<B(mJ)
A = Source vector base address
I = A address increment
B = Source vector base address
J = B address increment
C = Destination vector base address
K = C address increment
N = Element count

CALL LVEQ(A,I,B,J,C,K,N) LOGICAL VECTOR EQUAL
To compare the elements of two vectors A and B and output a
vector C such that:
C(mK)=1.0 if A(mI)=B(mJ)
C(mK)=0.0 if A(mI)not=B(mJ)
A = Source vector base address
I = A address increment
B = Source vector base address
J = B address increment
C = Destination vector base address
K = C address increment
N = Element count

CALL LVNE(A,I,B,J,C,K,N) LOGICAL VECTOR NOT EQUAL
To compare the elements of two vectors A and B and output a
vector C such that:
C(mK)=1.0 if A(mI)not=B(mJ)
C(mK)=0.0 if A(mI)=B(mJ)
A = Source vector base address
I = A address increment
B = Source vector base address
J = B address increment
C = Destination vector base address
K = C address increment
N = Element count
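
The logical vector routines above return floating-point flags rather than host LOGICAL values: 1.0 where the comparison holds and 0.0 where it does not. The Python sketch below applies the stated rules for LVGT and LVGE to plain lists. It is an illustrative host-side model; the lowercase function names are hypothetical and the address increments are omitted.

```python
def lvgt(a, b):
    """1.0 where a > b, else 0.0 (rule stated for LVGT)."""
    return [1.0 if x > y else 0.0 for x, y in zip(a, b)]


def lvge(a, b):
    """1.0 where a >= b, else 0.0 (rule stated for LVGE)."""
    return [1.0 if x >= y else 0.0 for x, y in zip(a, b)]


a = [1.0, 5.0, 3.0]
b = [2.0, 5.0, 1.0]
greater = lvgt(a, b)           # [0.0, 0.0, 1.0]
greater_or_equal = lvge(a, b)  # [0.0, 1.0, 1.0]
```

Because the flags are ordinary floating-point numbers, they can be fed straight into arithmetic routines, for example as a multiplicative mask.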


CALL LVNOT(A,I,C,K,N) LOGICAL VECTOR NOT
To examine the elements of a vector A and output a vector
such that:
C(mK)=1.0 if A(mI)=0.0
C(mK)=0.0 if A(mI)not=0.0
A = Source vector base address
I = A address increment
C = Destination vector base address
K = C address increment
N = Element count

CALL VLMERG(A,I,B,J,C,K,D,L,N) VECTOR LOGICAL MERGE
To examine the elements of three vectors, A, B, and C and
output a vector D such that:
D(mL)=A(mI) if C(mK)not=0.0
D(mL)=B(mJ) if C(mK)=0.0
A = Source vector base address
I = A address increment
B = Source vector base address
J = B address increment
C = Source vector base address
K = C address increment
D = Destination vector base address
L = D address increment
N = Element count

COMPLEX VECTOR ARITHMETIC

CALL CVMOV(A,I,C,K,N) COMPLEX VECTOR MOVE
To move the elements of a complex vector from one
location to another.
A = Source vector base address
I = A address increment
C = Destination vector base address
K = C address increment
N = Element count

CALL CVFILL(A,C,K,N) COMPLEX VECTOR FILL
To fill the elements of a complex vector with a complex
constant.
A = Complex constant base address
C = Destination vector base address
K = C address increment
N = Complex element count

CALL CVCOMB(A,I,B,J,C,K,N) COMPLEX VECTOR COMBINE
To form a complex vector by combining two real vectors.
A = Real source vector base address
I = A address increment
B = Imaginary source vector base address
J = B address increment
C = Destination vector base address
K = C address increment
N = Element count

CALL CVREAL(A,I,C,K,N) FORM COMPLEX VECTOR OF REALS
To form a complex vector by combining a real vector and
zeroing the imaginaries.
A = Real source vector base address
I = A address increment
C = Destination vector base address
K = C address increment
N = Element count

CALL VREAL(A,I,C,K,N) EXTRACT REALS OF COMPLEX VECTOR
To form a real vector by extracting the real parts from
a complex vector.
A = Complex source vector base address
I = A address increment
C = Real destination vector base address
K = C address increment
N = Element count

CALL VIMAG(A,I,C,K,N) EXTRACT IMAGINARIES OF COMPLEX VECTOR
To form a real vector by extracting the imaginary parts
from a complex vector.
A = Complex source vector base address
I = A address increment
C = Real destination vector base address
K = C address increment
N = Element count

CALL CVNEG(A,I,C,K,N) COMPLEX VECTOR NEGATE
To negate the elements of a complex vector.
A = Source vector base address
I = A address increment
C = Destination vector base address
K = C address increment
N = Element count

CALL CVCONJ(A,I,C,K,N) COMPLEX VECTOR CONJUGATE
To conjugate the elements of a complex vector.
A = Source vector base address
I = A address increment
C = Destination vector base address
K = C address increment
N = Complex element count

CALL CVADD(A,I,B,J,C,K,N) COMPLEX VECTOR ADD
To add the elements of two complex vectors.
A = Source vector base address
I = A address increment
B = Source vector base address
J = B address increment
C = Destination vector base address
K = C address increment
N = Element count

CALL CVSUB(A,I,B,J,C,K,N) COMPLEX VECTOR SUBTRACT
To subtract the elements of two complex vectors.
A = Source vector base address
I = A address increment
B = Source vector base address
J = B address increment
C = Destination vector base address
K = C address increment
N = Element count

CALL CVMUL(A,I,B,J,C,K,N,F) COMPLEX VECTOR MULTIPLY
To multiply the elements of two complex vectors.
A = Source vector base address
I = A address increment
B = Source vector base address
J = B address increment
C = Destination vector base address
K = C address increment
N = Complex element count
F = Conjugate flag,
+1 = normal complex multiply
-1 = multiply with conjugate of A

CALL CVSMUL(A,I,B,C,K,N) COMPLEX VECTOR SCALAR MULTIPLY
To multiply the elements of a complex vector by a real
scalar.
A = Source vector base address
I = A address increment
B = Scalar address
C = Destination vector base address
K = C address increment
N = Element count

CALL CVRCIP(A,I,C,K,N) COMPLEX VECTOR RECIPROCAL
To obtain the reciprocal of a complex vector.
A = Source vector base address
I = A address increment
C = Destination vector base address
K = C address increment
N = Complex element count

CALL CRVADD(A,I,B,J,C,K,N) COMPLEX AND REAL VECTOR ADD
To add the elements of a complex vector to the elements
of a real vector.
A = Source vector base address (complex)
I = A address increment
B = Source vector base address (real)
J = B address increment
C = Destination vector base address
K = C address increment
N = Element count

CALL CRVSUB(A,I,B,J,C,K,N) COMPLEX AND REAL VECTOR SUBTRACT
To subtract the elements of a real vector from the elements
of a complex vector.
A = Source vector base address (complex)
I = A address increment
B = Source vector base address (real)
J = B address increment
C = Destination vector base address
K = C address increment
N = Element count

CALL CRVMUL(A,I,B,J,C,K,N) COMPLEX AND REAL VECTOR MULTIPLY
To multiply the elements of a complex vector by the elements
of a real vector.
A = Source vector base address (complex)
I = A address increment
B = Source vector base address (real)
J = B address increment
C = Destination vector base address
K = C address increment
N = Element count

CALL CRVDIV(A,I,B,J,C,K,N) COMPLEX AND REAL VECTOR DIVIDE
To divide the elements of a complex vector by the elements
of a real vector.
A = Source vector base address (complex)
I = A address increment
B = Source vector base address (real)
J = B address increment
C = Destination vector base address
K = C address increment
N = Element count
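
The conjugate flag F of CVMUL selects between a normal complex multiply (+1) and a multiply with the conjugate of A (-1). The Python sketch below models that choice with Python's built-in complex type. It is a host-side illustration only: the function name `cvmul` is a hypothetical stand-in, the vectors are plain lists of complex numbers, and how complex elements are laid out in AP memory is deliberately not modeled.

```python
def cvmul(a, b, f=1):
    """Element-wise complex multiply; f = -1 conjugates A first."""
    if f not in (1, -1):
        raise ValueError("conjugate flag must be +1 or -1")
    return [(x.conjugate() if f == -1 else x) * y for x, y in zip(a, b)]


a = [1 + 2j]
b = [3 + 4j]
normal = cvmul(a, b, 1)       # (1+2j)*(3+4j) = -5+10j
conjugated = cvmul(a, b, -1)  # (1-2j)*(3+4j) = 11-2j
```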


CALL CVMA(A,1,B,J,0,K,D,L,N,F) 
Source vector base address 

a address increment 

Source vector base address 

B address increment 


= Source vector base address 
= C address increment 


Destination vector base address 
D address increment 
Complex element count 
Conjugate flag, 
+1 = normal complex multiply 
-l = multiply with conjugate of A 


CALL CVMAGS (A,I,C,K,N) 

Source vector base address 

A address increment 
Destination vector base address 
C address increment 

Complex element count 


CALL SCJMA(A,1I,B,J,C,K,N) 

Source complex vector base address 
A address increment 

Source real vector base address 

B address increment 

Destination real vector base address 
C address increment 

Complex element count 


CALL POLAR (A,1I,C,K,N) 

Source vector base address 

A address increment 
Destination vector base address 
C address increment 

Complex element count 


E-112 CALL RECT(A,I,C,K,N) 
A = Source vector base address 
I = A address increment 
C = Destination vector base address 
K = C address increment 
N = Complex element count 
E-113 CALL CVEXP (A,I,C,K,N) 
A = Source vector base address 
I = A address increment. 
C = Destination vector base address 
K = C address increment 
N = Element count 
E~114 CALL CVMEXP (A,1,B,J,C,K,N) 
A = Source vector base address 
I = A address increment 
FPS 860-7288-004 , Cc 


Purpose 


COMPLEX VECTOR MULTIPLY AND ADD 
To multiply the elements of two 
complex vectors, and add the 
products to a third complex 
vector. 


COMPLEX VECTOR MAGNITUDE SQUARED 
To compute the squared 
magnitude of the elements of a 
complex vector. 


SELF<CONJUGATE MULTIPLY AND ADD 
To multiply the elements of a 
complex vector by the conjugate 
of that vector (squared 
magnitude), and add the real 
products to a real vector. 


RECTANGULAR TO POLAR CONVERSION 
To convert a complex vector 
from rectangular to polar form. 


POLAR TO RECTANGULAR CONVERSION 


To convert a complex vector 
from polar to rectangular form. 


COMPLEX VECTOR EXPONENTIAL 
To calculate the complex 
exponential exp(iX)= 
COS (X)+iSIN(X). 


VECTOR MULTIPLY COMPLEX EXPONENTIAL 


To multiply a real vector by a 
complex exponential. 
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B = Source vector base address 
J = B address increment 
C = Destination vector base address 
K = C address increment 
N = Complex element count 


td 
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A 


CALL. CDOTPR(A,1,B,J,C,8) 
Source vector base address 
A address increment 
Source vector base address 
B address increment 
Destination scalar address 
Complex element count 


me 
fl in 


Li? CALL. VELT CASI Cy RN) 

A = Source vector base address 

I = A address increment 

C = Destination vector base address 
K = C address increment © 

N = Element count 


E-118 CALL VFIX(A,I,C,K,N) 


ri 
i 
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A = Source vector base address 

I = A address increment 

C = Destination vector base address 
K = C address increment © 

N = Element count 


CALL VSMAFX(A,I,B,C,D,L,N)
A = Source vector base address
I = A address increment
B = Multiplying scalar address
C = Adding scalar address
D = Destination vector base address
L = D address increment
N = Element count


21 CALL VSCALE(A,I,B,C,K,N,NB)
A = Source vector base address
I = A address increment
B = Scalar base address
C = Destination vector base address
K = C address increment
N = Element count
NB = Desired width (2 to 28 bits) of
integers, including sign bit


E-123 CALL VSCSCL(A,I,C,K,N,NB) 


A = Source vector base address 




Purpose 


COMPLEX DOT PRODUCT 


To compute the complex dot 
product of two complex vectors. 


VECTOR INTEGER FLOAT 


To convert a vector of integers 
to a vector of floating-point 
numbers. 


VECTOR INTEGER FIX 


To fix to integers the elements 
of a floating-point vector. 


VECTOR SCALAR MULTIPLY, ADD, AND FIX 


To multiply the elements of a 
vector by a scalar, add a
second scalar to the products, 
and fix the resulting sums to 
integers. 


VECTOR SCALE (POWER 2) AND FIX 


To scale the elements of a 
vector by a power of 2 such 
that a selected scalar will 
just fit into a specified 
integer bit width, and then fix 
the scaled elements to 
integers. 


VECTOR SCAN, SCALE (POWER 2) AND FIX 


To scale the elements of a


Routine


I = A address increment
C = Destination vector base address
K = C address increment
N = Element count
NB = Desired width (2 to 28 bits) of
integers, including sign bit


5 CALL VSHFX(A,I,C,K,N,NS)
A = Source vector base address
I = A address increment
C = Destination vector base address
K = C address increment
N = Element count
NS = Power of 2 (may be negative)


CALL VUP8(A,I,C,K,N)
A = Source vector base address
I = A address increment
C = Destination vector base address
K = C address increment
N = Element count (source words)


7 CALL VUPS8(A,I,C,K,N)
A = Source vector base address
I = A address increment
C = Destination vector base address
K = C address increment
N = Element count (source words)


CALL VPK8(A,I,C,K,N)
A = Source vector base address
I = A address increment
C = Destination vector base address
K = C address increment
N = Element count (destination words)


9 CALL VUP16(A,I,C,K,N)
A = Source vector base address
I = A address increment
C = Destination vector base address
K = C address increment
N = Element count (source words)


0 CALL VUPS16(A,I,C,K,N)
A = Source vector base address
I = A address increment
C = Destination vector base address
K = C address increment
N = Element count (source words)


1 CALL VPK16(A,I,C,K,N)
A = Source vector base address
I = A address increment
C = Destination vector base address




Purpose 


vector by a power of 2 such 
that the largest magnitude 
element will just fit into a 
specified integer bit width, 
and then fix the scaled 
elements to integers. 
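
The scan, scale, and fix operation described above can be sketched on the host as follows. This is an illustration, not the AP routine. It assumes that "just fit" means the largest magnitude does not exceed 2**(NB-1) - 1 after scaling and that fixing truncates toward zero; the manual summary states neither detail. The name `scan_scale_fix` is hypothetical.

```python
import math


def scan_scale_fix(values, nb):
    """Scale by a power of 2 so the peak fits nb signed bits, then fix.

    Returns (list of integers, power of 2 that was applied).
    """
    limit = 2 ** (nb - 1) - 1                 # largest nb-bit signed integer
    peak = max(abs(v) for v in values)        # scan for the largest magnitude
    if peak == 0:
        return [0 for _ in values], 0
    shift = 0
    while math.ldexp(peak, shift + 1) <= limit:   # room to scale up further
        shift += 1
    while math.ldexp(peak, shift) > limit:        # too large, scale down
        shift -= 1
    # int() truncates toward zero (assumed fixing rule)
    return [int(math.ldexp(v, shift)) for v in values], shift


print(scan_scale_fix([0.5, -3.0, 1.25], 8))   # ([16, -96, 40], 5)
```

The width NB counts the sign bit, so an 8-bit request leaves 7 bits for magnitude.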


VECTOR SHIFT AND FIX 
To shift (multiply by a power
of 2) and then fix to integers 
the elements of a 
floating-point vector. 


VECTOR 8-BIT BYTE UNPACK
To unpack four 8-bit unsigned
bytes from each source vector
word and store them in four
destination words as 38-bit
floating-point numbers.


VECTOR 8-BIT SIGNED BYTE UNPACK
To unpack four 8-bit 2's
complement signed bytes from
each source word and store them
in four destination words as
38-bit floating-point numbers.


VECTOR 8-BIT BYTE PACK
To pack each four 38-bit
floating-point numbers into one
destination word as 8-bit
bytes.


VECTOR 16-BIT BYTE UNPACK
To unpack two 16-bit unsigned
bytes from each source word and
store them in two destination
words as 38-bit floating-point
positive numbers.


VECTOR 16-BIT SIGNED BYTE UNPACK
To unpack two 16-bit signed 2's
complement bytes from each
source word and store them in
two destination words as signed
38-bit floating-point numbers.


VECTOR 16-BIT BYTE PACK
To pack each two 38-bit
floating-point numbers into one
destination word as 16-bit


Routine


K = C address increment
N = Element count (destination words)


CALL VFLT32(A,I,C,K,N)
A = Source vector base address
I = A address increment
C = Destination vector base address
K = C address increment
N = Element count


CALL VFIX32(A,I,C,K,N)
A = Source vector base address
I = A address increment
C = Destination vector base address
K = C address increment
N = Element count


CALL VSEFLT(A,I,C,K,N)
A = Source vector base address
I = A address increment
C = Destination vector base address
K = C address increment
N = Element count


Purpose 


bytes. 


VECTOR 32-BIT INTEGER FLOAT
To float 32-bit signed 2's
complement integers and store
them as 38-bit floating-point
integers.


VECTOR 32-BIT INTEGER FIX
To fix floating-point numbers
from -2147483648 to 2147483647
and store them in a destination
vector as 32-bit signed 2's
complement integers.


VECTOR SIGN EXTEND AND FLOAT
To extend the sign of a vector
of 16-bit integers and convert
them to floating-point numbers.
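
The byte unpack and pack operations summarized above amount to shifting and masking. The sketch below models the 8-bit case on the host, using a Python integer as a stand-in for the 32 bits that carry the four bytes. The byte order (most significant byte first) and the function names are assumptions, not taken from the manual.

```python
def unpack8(word, signed=False):
    """Split a 32-bit word into four 8-bit bytes, returned as floats."""
    out = []
    for shift in (24, 16, 8, 0):            # most significant byte first (assumed)
        byte = (word >> shift) & 0xFF
        if signed and byte >= 0x80:         # 2's complement sign extension
            byte -= 0x100
        out.append(float(byte))
    return out


def pack8(values):
    """Pack four numbers into one 32-bit word as 8-bit bytes."""
    word = 0
    for v in values:
        word = (word << 8) | (int(v) & 0xFF)   # keep the low 8 bits of each value
    return word


print(unpack8(0x01FF7F80))               # [1.0, 255.0, 127.0, 128.0]
print(unpack8(0x01FF7F80, signed=True))  # [1.0, -1.0, 127.0, -128.0]
```

Packing the signed result returns the original word, which shows that one source word corresponds to four destination words on unpack and the reverse on pack.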


CALL MTRANS(A,I,C,K,MC,NC)
A = Source matrix base address
I = A address increment
C = Destination matrix base address
K = C address increment
MC = Number of rows of C
(Columns of A)
NC = Number of columns of C
(Rows of A)


CALL MMUL(A,I,B,J,C,K,MC,NC,NA)
A = Source matrix base address
I = A address increment
B = Source matrix base address
J = B address increment
C = Destination matrix base address
K = C address increment


MC = Number of rows in C 


(Rows in A) 


NC = Number of columns in C 


(Columns in B) 


NA = Number of columns in A 


(Rows in B) 


E-139 CALL MMUL32(A,I,B,J,C,K,MC,NC,NA)


MATRIX TRANSPOSE 
To transpose a matrix. 


MATRIX MULTIPLY 
To multiply two matrices. 


MATRIX MULTIPLY (DIMENSION <=32)
To multiply two matrices with
dimensions <=32.


A = Source matrix base address
I = A address increment
B = Source matrix base address
J = B address increment
C = Destination matrix base address
K = C address increment
MC = Number of rows in C
(Rows in A)
NC = Number of columns in C
(Columns in B)
NA = Number of columns in A (<=32)
(Rows in B)
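
The relationship between MC, NC, and NA in the matrix multiply routines can be checked with a short host-side model. The sketch assumes unit address increments and column-major storage (the usual FORTRAN layout); neither the storage order nor the function name `mmul` comes from the summary above.

```python
def mmul(a, b, mc, nc, na):
    """C (mc x nc) = A (mc x na) * B (na x nc), flat column-major lists."""
    c = [0.0] * (mc * nc)
    for col in range(nc):
        for row in range(mc):
            acc = 0.0
            for t in range(na):
                # A(row, t) is at row + t*mc, B(t, col) is at t + col*na
                acc += a[row + t * mc] * b[t + col * na]
            c[row + col * mc] = acc
    return c


# A = [[1, 2, 3], [4, 5, 6]] and B = [[7, 8], [9, 10], [11, 12]]
a = [1.0, 4.0, 2.0, 5.0, 3.0, 6.0]
b = [7.0, 9.0, 11.0, 8.0, 10.0, 12.0]
print(mmul(a, b, 2, 2, 3))   # [58.0, 139.0, 64.0, 154.0]
```

Here MC = 2 and NC = 2 size the result, while NA = 3 is the shared inner dimension (columns of A, rows of B).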


E-141 CALL MATINV(A,N)                        MATRIX INVERSE
A = Source matrix base address                To invert a matrix.
A + N*N = Destination matrix base address
N = Number of rows (and columns) in A
E-143 CALL SOLVEQ(A,N,B,M,ROWADD,X,IERR)      LINEAR EQUATION SOLVER
A = Coefficient matrix base address           To solve a system of
N = Number of rows (and columns) in A         simultaneous linear equations.
B = Base address of matrix of M N-element
right hand sides
M = Number of N-element solution vectors
ROWADD = Base address of 2*N-element work
vector for row addresses
X = Base address for matrix of M N-element
solution vectors
IERR = Address of singularity value


E-145 CALL MVML3(A,I,B,J,JP,C,K,KP,N)         MATRIX VECTOR MULTIPLY (3X3)
A = 3x3 matrix base address                   To multiply a 3x3 matrix by a
I = A address increment                       series of 3-element column
B = First source vector base address          vectors.
J = Increment between the three
elements in each vector of B
JP = Increment between the first
element of each vector of B
C = First destination vector base address
K = Increment between the three
elements in each vector in C
KP = Increment between the first
element of each vector in C
N = Number of 3-element vectors


E-147 CALL MVML4(A,I,B,J,JP,C,K,KP,N)         MATRIX VECTOR MULTIPLY (4X4)
A = 4x4 matrix base address                   To multiply a 4x4 matrix by a
I = A address increment                       series of 4-element column
B = First source vector base address          vectors.
J = Increment between the four
elements in each vector of B
JP = Increment between the first
element of each vector of B
C = First destination vector base address
K = Increment between the four
elements in each vector in C
KP = Increment between the first
element of each vector in C
N = Number of 4-element vectors


E-149 CALL CTRN3(A,B,J,JP,C,D,L,LP,N)
A = 3x3 rotation matrix base address
B = First source vector base address
J = Increment between the three
elements in each vector of B
JP = Increment between the first
elements (x-coordinates) of
each vector of B
C = Base address of 3-element translation
vector
D = First destination vector base address
L = Increment between the three
elements in each vector in D
LP = Increment between the first
elements of each vector in D
N = Number of 3-element coordinate
vectors


E-151 CALL FMMM(A,B,C,MC,NC,NA) 

A = Source matrix base address 

B = Source matrix base address 

C = Destination matrix base address 

MC = Number of rows in C 
(Rows in A) 

NC = Number of columns in C 
(Columns in B)

NA = Number of columns in A 
(Rows in B) 


E-153 CALL FMMM32(A,B,C,MC,NC,NA)
A = Source matrix base address
B = Source matrix base address
C = Destination matrix base address
MC = Number of rows in C
(Rows in A)
NC = Number of columns in C
(Columns in B)
NA = Number of columns in A (<=32)
(Rows in B)


FFT OPERATIONS 


Base address of 3-element translation 


Purpose 


3-DIMENSION COORDINATE TRANSFORMATION
To transform a series of
3-dimensional coordinates
(translation and rotation).


FAST MEMORY MATRIX MULTIPLY 
To multiply two matrices. 
(Available for 167 ns memory 
only.) 


FAST MEMORY MATRIX MULTIPLY (<=32) 
To multiply two matrices with 
dimensions <=32. (Available 
for 167 ns memory only.) 


E-156 CALL CFFT(C,N,F) 
C = Source and destination vector
base address
N = Complex element count (power of 2)
F = Direction flag, +1 for forward
-1 for inverse


COMPLEX TO COMPLEX FFT (IN PLACE)
To perform an in-place complex
forward or inverse fast Fourier


E-158 CALL CFFTB(A,C,N,F) 


A = Source vector base address 
C = Destination vector base address 
N = Complex element count (power of 2)
F = Direction flag, +1 for forward
-1 for inverse


E-160 CALL RFFT(C,N,F) 


C = Source and destination vector 
base address 
N = Real element count (power of 2) 
F = Direction flag, +1 for forward 
-1 for inverse


E-162 CALL RFFTB(A,C,N,F)

A = Source vector base address 

C = Destination vector base address 

N = Real element count (power of 2) 

F = Direction flag, +1 for forward 
-1 for inverse


E-164 CALL CFFTSC(C,N) 


C = Source and destination vector 
base address 
N = Complex element count (power of 2) 


E-165 CALL RFFTSC(C,N,F,FS)


C = Source and destination vector 
base address 
N = Real element count (power of 2) 
F = Formatting flag 
1,0,-1 = No format change
2 = Unpack RFFT result into N/2
complex elements
3 = Unpack RFFT result into N/2 + 1
complex elements
-2 = Pack N/2 complex elements
into RFFT format
-3 = Pack N/2 + 1 complex elements
into RFFT format
FS = Scaling flag
0 = No scaling
1 = Multiply by 1/(2*N)
-1 = Multiply by 1/(4*N)


E-167 CALL CFFT2D(C,N1,N2,F) 


C = Source and destination array address
N1 = Number of columns = length of rows
N2 = Number of rows = length of columns
(Note: N1*N2 <= 32768)


Purpose


transform (FFT). 


COMPLEX TO COMPLEX FFT (NOT IN PLACE) 
To perform a not-in-place 
complex forward or inverse fast 
Fourier transform (FFT). 


REAL TO COMPLEX FFT (IN PLACE) 
To perform an in-place 
real-to-complex forward or a 
complex-to-real inverse fast 
Fourier transform (FFT). 


REAL TO COMPLEX FFT (NOT IN PLACE) 
To perform a not-in-place
real-to-complex forward or a
complex-to-real inverse fast 
Fourier transform (FFT). 


COMPLEX FFT SCALE 
To scale complex-to-complex 
forward FFT results. 


REAL FFT SCALE AND FORMAT 
To scale real-to-complex FFT
results and/or change a complex 
vector between the special RFFT 
complex format and the normal 
complex vector format. 


COMPLEX TO COMPLEX 2-DIMENSIONAL FFT
To perform an in-place complex
two-dimensional FFT on
rectangular arrays which occupy
no more than 65536 main data
memory locations (one page).


F = Forward-inverse flag
E-169 CALL RFFT2D(C,N1,N2,F)                  REAL TO COMPLEX 2-DIMENSIONAL FFT
C = Source and destination array address      To perform an in-place real
N1 = Number of columns = length of rows       two-dimensional FFT on
N2 = Number of rows = length of columns       rectangular arrays which occupy
(Note: N1*N2 <= 65536)                        no more than 65536 main data
F = Forward-inverse flag                      memory locations (one page).
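
The role of the direction flag and the power-of-2 element count in the FFT routines can be illustrated with a naive transform on the host. This sketch is a plain O(N*N) discrete Fourier transform, not the AP algorithm. The sign convention (negative exponent for F = +1) and the absence of scaling are assumptions; the name `dft` is hypothetical.

```python
import cmath


def is_power_of_two(n):
    return n > 0 and (n & (n - 1)) == 0


def dft(x, f):
    """Naive DFT of a complex list; f = +1 forward, f = -1 inverse."""
    n = len(x)
    if not is_power_of_two(n):
        raise ValueError("element count must be a power of 2")
    sign = -1.0 if f > 0 else 1.0          # assumed sign convention
    out = []
    for k in range(n):
        acc = 0j
        for t in range(n):
            acc += x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
        out.append(acc)                    # no scaling is applied here
    return out


x = [1 + 0j, 2 - 1j, 0.5j, -3 + 0j]
round_trip = dft(dft(x, +1), -1)           # equals len(x) times x
```

In this unscaled sketch a forward transform followed by an inverse returns the input multiplied by the element count, which is why separate scaling routines are useful. The page limits above are consistent with two words per complex element: 32768 complex elements occupy 65536 words.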


E-172 CALL CONV(A,I,B,J,C,K,N,M)              CONVOLUTION (CORRELATION)
A = Operand vector base address               To perform a convolution or
I = A address increment                       correlation operation on two
B = Operator vector base address              vectors.
J = B address increment
C = Destination vector base address
K = C address increment
N = Element count for C (result)
M = Element count for B (operator)
(Element count for A (operand) must
be N+M-1)
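
The sizing rule above (operand length N+M-1 for N results and an M-element operator) follows from sliding the operator along the operand. The sketch below assumes the correlation form C(p) = sum over q of A(p+q)*B(q), with convolution obtained by traversing the operator in reverse; the summary does not give the formula, so treat both the form and the function name as assumptions.

```python
def correlate(a, b, n):
    """Return n lags of the sliding product of operand a and operator b."""
    m = len(b)
    if len(a) < n + m - 1:
        raise ValueError("operand must hold at least N+M-1 elements")
    return [sum(a[p + q] * b[q] for q in range(m)) for p in range(n)]


a = [1.0, 2.0, 3.0, 4.0, 5.0]      # operand, N+M-1 = 5 elements
b = [1.0, 0.0, -1.0]               # operator, M = 3 elements
print(correlate(a, b, 3))          # [-2.0, -2.0, -2.0]
print(correlate(a, b[::-1], 3))    # operator reversed: [2.0, 2.0, 2.0]
```

Each of the N outputs reads M consecutive operand elements, so the last output reaches element N+M-2, which is where the N+M-1 requirement comes from.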
E-174 CALL DEQ22(A,I,B,C,K,N)                 DIFFERENCE EQUATION, 2 POLES, 2 ZEROS
A = Source vector base address                To perform a 2-pole, 2-zero
I = A address increment                       recursive digital filtering
B = Base address of 5 filter coefficients     difference equation on a
C = Destination vector base address           vector.
K = C address increment
N = Element count
E-175 CALL VPOLY(A,I,B,J,C,K,N,P)             VECTOR POLYNOMIAL EVALUATION
A = Coefficient vector base address           To evaluate a vector
(Highest order coefficient is first)          polynomial.
I = A address increment
B = Source vector base address
J = B address increment
C = Destination vector base address
K = C address increment
N = Element count (of B and C)
P = Order of polynomial (>1)
E-177 CALL VSUM(A,I,C,K,N,H)                  VECTOR SUM OF ELEMENTS INTEGRATION
A = Source vector base address                To integrate a vector by
I = A address increment                       performing a running scaled sum
C = Destination vector base address           of the elements of the vector.
K = C address increment
N = Element count
H = Address of integration step size
E-178 CALL VTRAPZ(A,I,C,K,N,H)                VECTOR TRAPEZOIDAL RULE INTEGRATION
A = Source vector base address
I = A address increment
C = Destination vector base address
K = C address increment
N = Element count
H = Address of integration step size


CALL VSIMPS(A,I,C,K,N,H)
A = Source vector base address
I = A address increment
C = Destination vector base address
K = C address increment
N = Element count
H = Address of integration step size


80 CALL WIENER(LR,R,G,F,A,ISW)
LR = Filter length
R = Source vector base address
(auto-correlation coefficients)
G = Source vector base address
(cross-correlation)
F = Destination vector base address
(filter weighting coefficients)
A = Destination vector base address
(prediction error operator)
ISW = Algorithm switch
0 for spike deconvolution
1 for general deconvolution


E-183 CALL HIST(A,I,C,N,NB,AMAX,AMIN)
A = Source vector base address
I = A address increment
C = Histogram vector base address
N = Element count for A
NB = Element count (bins) in C
AMAX = Address of maximum histogram value
AMIN = Address of minimum histogram value


CALL HANN(A,I,C,K,N,F)
A = Source vector base address
I = A address increment
C = Destination vector base address
K = C address increment
N = Element count (a power of 2)
F = Normalization flag
F=0 means unnormalized Hanning window
(peak window value=1.0)
F=1 means normalized Hanning window
(peak window value=1.63)


Purpose


To integrate a vector by using 
the trapezoidal rule. 


VECTOR SIMPSON'S 1/3 RULE INTEGRATION
To integrate a vector by using
Simpson's 1/3 rule.
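
The running integration routines produce one output element per input element. The sketch below shows two of the rules on the host: a running scaled sum and a running trapezoidal sum. The starting values (the first trapezoidal output is taken as zero) and the function names are assumptions; the summary does not state how the AP routines treat the first element.

```python
def running_sum(a, h):
    """Running scaled sum: c[m] = h * (a[0] + ... + a[m])."""
    out, acc = [], 0.0
    for v in a:
        acc += h * v
        out.append(acc)
    return out


def running_trapezoid(a, h):
    """Running trapezoidal rule: c[m] = c[m-1] + h*(a[m-1] + a[m])/2."""
    out = [0.0]                              # assumed starting value
    for m in range(1, len(a)):
        out.append(out[-1] + h * (a[m - 1] + a[m]) / 2.0)
    return out


samples = [0.0, 1.0, 2.0, 3.0]               # f(x) = x sampled at step 1
print(running_sum(samples, 0.5))             # [0.0, 0.5, 1.5, 3.0]
print(running_trapezoid(samples, 1.0))       # [0.0, 0.5, 2.0, 4.5]
```

For the linear samples used here the trapezoidal result matches the exact integral x*x/2 at every point.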


WIENER LEVINSON ALGORITHM 
To solve a system of single 
channel normal equations which 
arise in least squares 
filtering and prediction 
problems.


SIGNAL PROCESSING OPERATIONS
(optional)


HISTOGRAM 
To perform a histogram on a 
vector. 


HANNING WINDOW MULTIPLY 
To multiply a vector by a 
Hanning window. 
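
The two peak values quoted for the Hanning window (1.0 unnormalized, 1.63 normalized) are consistent with a window normalized to unit mean square. The sketch below assumes the window form 0.5*(1 - cos(2*pi*t/N)) and a normalization factor of sqrt(8/3); both are assumptions that reproduce the quoted peaks, not formulas taken from the manual.

```python
import math


def hanning(n, normalized=False):
    """Return an n-point Hanning window, optionally scaled to unit mean square."""
    scale = math.sqrt(8.0 / 3.0) if normalized else 1.0
    return [scale * 0.5 * (1.0 - math.cos(2.0 * math.pi * t / n))
            for t in range(n)]


def apply_window(a, normalized=False):
    """Multiply a vector element by element by the window."""
    w = hanning(len(a), normalized)
    return [x * y for x, y in zip(a, w)]


w = hanning(8, normalized=True)
print(round(max(w), 2))                      # 1.63
print(round(sum(v * v for v in w) / 8, 6))   # 1.0
```

The mean square of the unnormalized window is 3/8, so multiplying by sqrt(8/3) restores unit average power and raises the peak to about 1.63.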




Page 


Routine 


CALL ASPEC(A,C,N)
A = Source complex vector base address
C = Destination real vector base address
N = Element count
(Note vector elements occupy consecutive
addresses.)


E-187 CALL CSPEC(A,B,C,N)
A = Source vector base address
B = Source vector base address
C = Destination vector base address
N = Element count
(Note vector elements occupy consecutive
addresses.)


E-188 CALL VAVLIN(A,I,B,C,K,N)
A = Source vector base address
I = A address increment
B = Address for number of vectors
included in current average
C = Averaged vector base address
K = C address increment
N = Element count


CALL VAVEXP(A,I,B,C,K,N)
A = Source vector base address
I = A address increment
B = Address for discount factor
C = Averaged vector base address
K = C address increment
N = Element count


CALL VDBPWR(A,I,B,C,K,N)
A = Source vector base address
I = A address increment
B = Address of scalar reference (0 dB)
value
C = Destination vector base address
K = C address increment
N = Element count


CALL TRANS(A,B,C,N)
A = Auto-spectrum base address (real)
B = Cross-spectrum base address (complex)
C = Complex transfer function base address
N = Element count
(Note vector elements occupy consecutive
addresses.)


E-192 CALL COHER(A,B,C,D,N)
A = Auto-spectrum base address (real)
B = Auto-spectrum base address (real)
C = Cross-spectrum base address (complex)


Purpose 


ACCUMULATING AUTO-SPECTRUM
To perform accumulating
auto-spectrum calculation on a
complex vector.


ACCUMULATING CROSS-SPECTRUM
To perform accumulating
cross-spectrum calculation on
two complex vectors.


VECTOR LINEAR AVERAGING 
To update the linear average of 
a sequence of vectors to 
include a new vector. 


VECTOR EXPONENTIAL AVERAGING 
To update the approximately 
exponential average of a 
sequence of vectors to include 
a new vector. 


VECTOR CONVERSION TO DB (POWER) 
To compute the decibel (power) 
equivalents of the elements of 
a vector, relative to a 
specified scalar value. 


TRANSFER FUNCTION 
To perform a complex transfer
function calculation by
dividing the cross-spectrum by
the auto-spectrum. 


COHERENCE FUNCTION
To compute the coherence
function, given the
auto-spectra of two signals and
the cross-spectrum between
them.


D = Coherence function base address (real)
N = Element count
(Note vector elements occupy consecutive
addresses.)
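
The decibel, transfer function, and coherence calculations listed on this page reduce to element-by-element arithmetic on spectra. The sketch below uses the standard textbook definitions: power ratio in decibels as 10*log10(a/ref), transfer function as cross-spectrum divided by auto-spectrum, and coherence as |cross|**2 divided by the product of the two auto-spectra. These definitions and the function names are assumptions; the manual summary only names the operations.

```python
import math


def db_power(a, ref):
    """Decibel (power) value of each element relative to ref."""
    return [10.0 * math.log10(v / ref) for v in a]


def transfer(auto_x, cross_xy):
    """Complex transfer function: cross-spectrum over real auto-spectrum."""
    return [g / s for g, s in zip(cross_xy, auto_x)]


def coherence(auto_x, auto_y, cross_xy):
    """Real coherence function from two auto-spectra and the cross-spectrum."""
    return [abs(g) ** 2 / (sx * sy)
            for sx, sy, g in zip(auto_x, auto_y, cross_xy)]


print(db_power([1.0, 10.0, 100.0], 1.0))
print(transfer([4.0], [3 + 4j]))
print(coherence([4.0], [9.0], [3 + 4j]))
```

With these definitions the coherence lies between 0 and 1 whenever the spectra come from consistent averages, and the reference value passed to `db_power` maps to 0 dB.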


E-193 CALL ACORT(A,C,N,M)                     AUTO-CORRELATION (TIME-DOMAIN)
A = Source vector base address                To perform an auto-correlation
C = Destination vector base address           operation on a vector using
N = Element count for C (number of lags)      time-domain techniques.
M = Element count for A
(Note vector elements occupy consecutive
addresses.)


E-195 CALL ACORF(A,C,N,M)                     AUTO-CORRELATION (FREQUENCY-DOMAIN)
A = Source vector base address                To perform an auto-correlation
C = Destination vector base address           operation on a vector using
N = Element count for C (number of lags)      frequency-domain (FFT)
M = Element count for A (power of 2)          techniques.
(Note vector elements occupy consecutive
addresses. Requires 2M words storage
for A.)


E-197 CALL CCORT(A,B,C,N,M)                   CROSS-CORRELATION (TIME-DOMAIN)
A = Source vector (operand) base address      To perform a cross-correlation
B = Source vector (operator) base address     operation on two vectors using
C = Destination vector base address           time-domain techniques.
N = Element count for C (number of lags)
M = Element count for A and B
(Note vector elements occupy consecutive
addresses.)


E-199 CALL CCORF(A,B,C,N,M)                   CROSS-CORRELATION (FREQUENCY-DOMAIN)
A = Source vector (operand) base address      To perform a cross-correlation
B = Source vector (operator) base address     operation on two vectors using
C = Destination vector base address           frequency-domain (FFT)
N = Element count for C (number of lags)      techniques.
M = Element count for A and B (power of 2)
(Note vector elements occupy consecutive
addresses. Requires 2M words storage for
A and 2M words storage for B.)


E-201 CALL TCONV(A,I,B,J,C,K,N,M,L)           POST-TAPERED CONVOLUTION (CORRELATION)
A = Source (operand) vector base address      To perform a post-tapered
I = A address increment (>0)                  convolution or correlation
B = Source (operator) vector base address     operation on two vectors.
J = B address increment
C = Destination vector base address
K = C address increment
N = Element count for C (result)
M = Element count for B (operator)
L = Element count for A (operand)


Purpose


TABLE MEMORY OPERATIONS
(optional)


CALL MTMOV(A,C,N)
A = Source vector base address (MD)
C = Destination vector base address (TM)
N = Element count


CALL TMMOV(A,C,N)
A = Source vector base address (TM)
C = Destination vector base address (MD)
N = Element count


CALL MTIMOV(A,I,C,K,N)
A = Source vector base address (MD)
I = A address increment
C = Destination vector base address (TM)
K = C address increment
N = Element count


CALL TMIMOV(A,I,C,K,N)
A = Source vector base address (TM)
I = A address increment
C = Destination vector base address (MD)
K = C address increment
N = Element count


CALL TTIMOV(A,I,C,K,N)
A = Source vector base address (TM)
I = A address increment
C = Destination vector base address (TM)
K = C address increment
N = Element count


CALL MMTADD(A,I,B,J,C,K,N)
A = Source vector base address (MD)
I = A address increment
B = Source vector base address (MD)
J = B address increment
C = Destination vector base address (TM)
K = C address increment
N = Element count


CALL MMTSUB(A,I,B,J,C,K,N)
A = Source vector base address (MD)
I = A address increment
B = Source vector base address (MD)
J = B address increment
C = Destination vector base address (TM)
K = C address increment
N = Element count


VECTOR MOVE (MD TO TM)
To transfer elements of a
vector from main data to table
memory, where both vectors are
stored compactly.


VECTOR MOVE (TM TO MD)
To transfer elements of a
vector from table memory to
main data memory, where both
vectors are stored compactly.


VECTOR MOVE WITH INCREMENT (MD TO TM)
To move elements of a vector
from main data memory to table
memory, where the increments
between the elements are
specified.


VECTOR MOVE WITH INCREMENT (TM TO MD)
To move elements of a vector in
table memory to main data
memory, where the increments
between elements are specified.


VECTOR MOVE WITH INCREMENT (TM TO TM)
To move elements of a vector
within table memory.


VECTOR ADD (MD+MD TO TM)
To add the elements of two
vectors in main data memory and
store the results in a vector
in table memory.


VECTOR SUBTRACT (MD-MD TO TM)
To subtract the elements of two
vectors in main data memory and
store the results in a vector
in table memory.


Routine


CALL MMTMUL(A,I,B,J,C,K,N)
A = Source vector base address (MD)
I = A address increment
B = Source vector base address (MD)
J = B address increment
C = Destination vector base address (TM)
K = C address increment
N = Element count


CALL MTMADD(A,I,B,J,C,K,N)
A = Source vector base address (MD)
I = A address increment
B = Source vector base address (TM)
J = B address increment
C = Destination vector base address (MD)
K = C address increment
N = Element count


CALL MTMSUB(A,I,B,J,C,K,N)
A = Source vector base address (MD)
I = A address increment
B = Source vector base address (TM)
J = B address increment
C = Destination vector base address (MD)
K = C address increment
N = Element count


CALL TMMSUB(A,I,B,J,C,K,N)
A = Source vector base address (TM)
I = A address increment
B = Source vector base address (MD)
J = B address increment
C = Destination vector base address (MD)
K = C address increment
N = Element count


CALL MTMMUL(A,I,B,J,C,K,N)
A = Source vector base address (MD)
I = A address increment
B = Source vector base address (TM)
J = B address increment
C = Destination vector base address (MD)
K = C address increment
N = Element count


CALL MTTADD(A,I,B,J,C,K,N)
A = Source vector base address (MD)
I = A address increment
B = Source vector base address (TM)
J = B address increment
C = Destination vector base address (TM)
K = C address increment
N = Element count


Purpose


VECTOR MULTIPLY (MD*MD TO TM)
To multiply the elements of two
vectors in main data memory and
store the results in table
memory.


VECTOR ADD (MD+TM TO MD)
To add elements of a vector in
main data memory to elements of
a vector in table memory and
store the results in main data
memory.


VECTOR SUBTRACT (MD-TM TO MD)
To subtract the elements of a
vector in table memory from the
elements of a vector in main
data memory and store the
results in main data memory.


VECTOR SUBTRACT (TM-MD TO MD)
To subtract the elements of a
vector in main data memory from
a vector in table memory and
store the differences in main
data memory.


VECTOR MULTIPLY (MD*TM TO MD)
To multiply elements of a
vector in main data memory by
elements of a vector in table
memory and store the products
in main data memory.


VECTOR ADD (MD+TM TO TM)
To add the elements of a vector
in main data memory to elements
of a vector in table memory and
store the sums in a vector in
table memory.


Routine


CALL MTTSUB(A,I,B,J,C,K,N)
A = Source vector base address (MD)
I = A address increment
B = Source vector base address (TM)
J = B address increment
C = Destination vector base address (TM)
K = C address increment
N = Element count


CALL TMTSUB(A,I,B,J,C,K,N)
A = Source vector base address (TM)
I = A address increment
B = Source vector base address (MD)
J = B address increment
C = Destination vector base address (TM)
K = C address increment
N = Element count


CALL MTTMUL(A,I,B,J,C,K,N)
A = Source vector base address (MD)
I = A address increment
B = Source vector base address (TM)
J = B address increment
C = Destination vector base address (TM)
K = C address increment
N = Element count


CALL TTMADD(A,I,B,J,C,K,N)
A = Source vector base address (TM)
I = A address increment
B = Source vector base address (TM)
J = B address increment
C = Destination vector base address (MD)
K = C address increment
N = Element count


CALL TTMSUB(A,I,B,J,C,K,N)
A = Source vector base address (TM)
I = A address increment
B = Source vector base address (TM)
J = B address increment
C = Destination vector base address (MD)
K = C address increment
N = Element count


CALL TTMMUL(A,I,B,J,C,K,N)
A = Source vector base address (TM)
I = A address increment
B = Source vector base address (TM)
J = B address increment
C = Destination vector base address (MD)
K = C address increment
N = Element count


Purpose 


VECTOR SUBTRACT (MD-TM TO TM)


To subtract the elements of a
vector in table memory from
elements of a vector in main
data memory and store the
differences in table memory.


VECTOR SUBTRACT (TM-MD TO TM)


To subtract the elements of a
vector in main data memory from
the elements of a vector in
table memory and store the
results in table memory.


VECTOR MULTIPLY (MD*TM TO TM)


To multiply the elements of a
vector in main data memory by
the elements of a vector in
table memory and store the
products in table memory.


VECTOR ADD (TM+TM TO MD)


To add the elements of two
vectors in table memory and
store the sums in main data
memory.


VECTOR SUBTRACT (TM-TM TO MD)


To subtract the elements of two
vectors in table memory and
store the difference in main
data memory.


VECTOR MULTIPLY (TM*TM TO MD)


To multiply the elements of two
vectors in table memory and
store the products in main data
memory.


Routine


CALL TTTADD(A,I,B,J,C,K,N)
A = Source vector base address (TM)
I = A address increment
B = Source vector base address (TM)
J = B address increment
C = Destination vector base address (TM)
K = C address increment
N = Element count


CALL TTTSUB(A,I,B,J,C,K,N)
A = Source vector base address (TM)
I = A address increment
B = Source vector base address (TM)
J = B address increment
C = Destination vector base address (TM)
K = C address increment
N = Element count


CALL TTTMUL(A,I,B,J,C,K,N)
A = Source vector base address (TM)
I = A address increment
B = Source vector base address (TM)
J = B address increment
C = Destination vector base address (TM)
K = C address increment
N = Element count


Purpose 


VECTOR ADD (TM+TM TO TM)


To add the elements of two
vectors in table memory and
store the sums in a third
vector in table memory.


VECTOR SUBTRACT (TM-TM TO TM)


To subtract the elements of two
vectors in table memory and
store the differences in a
vector in table memory.


VECTOR MULTIPLY (TM*TM TO TM)


To multiply the elements of two
vectors in table memory and
store the products in a vector
in table memory.


APPENDIX D


AP-FORTRAN ROUTINES 


Many of the routines in the AP Math Library are available for use in 
AP-FORTRAN program units. These routines contain alternate entry 
points permitting AP-FORTRAN program units to call them. Because of 
these alternate entry points, the routines are called by different 
names under AP-FORTRAN. A list of the routines' names and their
corresponding AP-FORTRAN calling names is contained in Table D-1. This
table lists all routines callable from AP-FORTRAN program units. The
parameters associated with the routines are described in Appendices C
and E. Regarding the associated parameters, the AP-FORTRAN user should
be aware of the following: 


• The data transfer and control operations and the
APAL-callable utility operations are not available under
AP-FORTRAN. The data transfer and control operations are
not needed by the AP-FORTRAN user. The AP-FORTRAN program
unit executes in the AP, and thus transferring data and
controlling operations are already provided for. The
AP-FORTRAN programmer does not need to place data into the
AP using APPUT or retrieve it using APGET; the data can be
made available to the routines by passing common blocks or
defining values in the AP-FORTRAN program unit.


• In Appendices C and E, when parameters are described as
base addresses, the AP-FORTRAN user should substitute the
term "name". For example, the term "source vector base
address" translates into "source vector (or array) name".
The name specified should be the name of a properly
dimensioned array.


• Parameters which are described as values in Appendices C
and E can be specified as variable names under AP-FORTRAN.
All routines are called by reference under AP-FORTRAN.


Table D-1 AP-FORTRAN Callable Math Library Routines


QUT INE JESCRIOTION AP-FORTRAN CALLABLE NAME 


Serene ee eee ere eee ee emussnpmnnn sanamourenapunnesnseamnsnstannessenstbediscain ravi saint 


ACORF Auto-correlation ‘ frequency-domain) FFACOR(a,c,n,m) 

ACORT Auto-correlation (time-domain) | FTACOR(a,c,n,m) 

ASPEC Accumulating auto-spectrum FASPEC(a,c,n) 

CCORF Cross-correlation (frequency-domain) | FFCCOR(a,b,c,n,m) 

CFFT Complex to complex FFT (in place) FCFFT(c,n,f) 

CFFTB Complex to complex FFT (not in place) FBCFFT(a,c,n,T) 

CRVDIV Complex and real vector divide FCRVDI (a,i.b.J,c,k»n} 

CICOMB . . | FCVCMB(a,i,b,J.¢,k,n} 

CYCONG FCVCNd(a,i,c,k,n) 

CVFILL Complex vector fill FCVFIL(a,c,kn) 

CVREAL From complex vector of reals FCVREAL(a,i,¢,k,n) ag 
cvSMUL FCWSMUCaTByesKA) 
DAWRIT Write device address register FDAWRT (da,val) 

DEQ22 Difference equation, 2 roles. 2 zeros EDEQ22(a,1,9,C,«,n) aa 


DOTPR Dot product FDOTPR(a,i,d,Jj,¢.9) 

Table D-1 AP-FORTRAN Callable Math Library Routines (cont.)


ROUTINE   DESCRIPTION                              AP-FORTRAN CALLABLE NAME
------------------------------------------------------------------------------------------
ECVMUL    Extended complex vector multiply         FECVMU(ah,al,i,bh,bl,j,ch,cl,k,nh,nl,f)
EDOTPR    Extended dot product                     FEDTPR(ah,al,i,bh,bl,j,ch,cl,nh,nl)
EMMUL     Extended matrix multiply                 FEMMUL(ah,al,bh,bl,ch,cl,mc,nc,na)
EMTRAN    Extended matrix transpose                FEMTRN(ah,al,ch,cl,k,mc,nc)
EVADD     Extended vector add                      FEVADD(ah,al,i,bh,bl,j,ch,cl,k,nh,nl)
EVCLR     Extended vector clear                    FEVCLR(ch,cl,k,nh,nl)
EVDIV     Extended vector divide                   FEVDIV(ah,al,i,bh,bl,j,ch,cl,k,nh,nl)
EVMOV     Extended vector move                     FEVMOV(ah,al,i,ch,cl,k,nh,nl)
EVMUL     Extended vector multiply                 FEVMUL(ah,al,i,bh,bl,j,ch,cl,k,nh,nl)
EVSUB     Extended vector subtract                 FEVSUB(ah,al,i,bh,bl,j,ch,cl,k,nh,nl)
EVSWAP    Extended vector swap                     FEVSWP(ah,al,i,ch,cl,k,nh,nl)
FMMM      Fast memory matrix multiply              FFMMM(a,b,c,mc,nc,na)
FMMM32    Fast memory matrix multiply (<=32)       FFMM32(a,b,c,mc,nc,na)
HANN      Hanning window multiply                  FHANN(a,i,c,k,n,f)
HIST      Histogram                                FHIST(a,i,c,n,nb,amax,amin)
[illegible]
IOPWD     Wait for IOP data transfer               FIOPWD
MATINV    Matrix inverse                           FMATIN(a,[illegible])
[illegible]
MEAMGV    Mean of vector element magnitudes        FMEMGV(a,i,c,n)
MINV      Minimum element in vector                FMINV(a,i,c,n)
MMTMUL    Vector multiply (MD*MD to TM)            FMMMT(a,i,b,j,c,k,n)
MMTSUB    Vector subtract (MD-MD to TM)            FSMMT(a,i,b,j,c,k,n)
MMUL      Matrix multiply                          FMMUL(a,i,b,j,c,k,mc,nc,na)
MMUL32    Matrix multiply (dimension<=32)          FMMU32(a,i,b,j,c,k,mc,nc,na)
MTIMOV    Vector move with increment (MD to TM)    FMTIMO(a,i,c,k,n)
MTMADD    Vector add (MD+TM to MD)                 FAMTMD(a,i,b,j,c,k,n)


Table D-1 AP-FORTRAN Callable Math Library Routines (cont.)


ROUTINE   DESCRIPTION                              AP-FORTRAN CALLABLE NAME
------------------------------------------------------------------------------------------
MTMMUL    Vector multiply (MD*TM to MD)            FMMTMD(a,i,b,j,c,k,n)
MTMOV     Vector move (MD to TM)                   FMTMOV(a,c,n)
MTMSUB    Vector subtract (MD-TM to MD)            FSMTMD(a,i,b,j,c,k,n)
MTRANS    Matrix transpose                         FMTRNS(a,i,c,k,mc,nc)
MTTADD    Vector add (MD+TM to TM)                 FAMTT(a,i,b,j,c,k,n)
MTTMUL    Vector multiply (MD*TM to TM)            FMMTT(a,i,b,j,c,k,n)
MTTSUB    Vector subtract (MD-TM to TM)            FSMTT(a,i,b,j,c,k,n)
MVML3     Matrix vector multiply (3x3)             FMVMU3(a,i,b,j,jp,c,k,kp,n)
MVML4     Matrix vector multiply (4x4)             FMVML4(a,i,b,j,jp,c,k,kp,n)
POLAR     Rectangular to polar conversion          FPOLAR(a,i,c,k,n)
RDC5      Read control bit 5 interrupt             FRDC5(c)
RDPAR     Read parity registers                    FRDPAR(c)
RDPG      Read memory page from AP                 FRDPG(c)
RECT      Polar to rectangular conversion          FRECT(a,i,c,k,n)
RFFT      Real to complex FFT (in place)           FRFFT(c,n,f)
RFFTB     Real to complex FFT (not in place)       FBRFFT(a,c,n,f)
RFFTSC    Real FFT scale and format                [illegible]
RMSQV     Root-mean-square of vector elements      FRMSQV(a,i,c,n)
SCJMA     Self-conjugate multiply and add          FSCJMA(a,i,b,j,c,k,n)
SETC5     Set control bit 5 interrupt              FSETC5
SETPG     Set memory page for AP                   FSETPG(mask,apmae,mae)
SOLVEQ    Linear equation solver                   FSOVEQ(a,n,b,m,rowadd,x,ierr)
SVE       Sum of vector elements                   FSVE(a,i,c,n)
SVEMG     Sum of vector element magnitudes         FSVEMG(a,i,c,n)
SVESQ     Sum of vector element squares            FSVESQ(a,i,c,n)
SVS       Sum of vector signed squares             FSVS(a,i,c,n)
TCONV     Post-tapered convolution (correlation)   FTCONV(a,i,b,j,c,k,n,m,l)
TMIMOV    Vector move with increment (TM to MD)    FTMIMO(a,i,c,k,n)
TMMOV     Vector move (TM to MD)                   FTMMOV(a,c,n)
TMMSUB    Vector subtract (TM-MD to MD)            FSTMMD(a,i,b,j,c,k,n)
TMTSUB    Vector subtract (TM-MD to TM)            FSTMT(a,i,b,j,c,k,n)
TRANS     Transfer function                        FTRANS(a,b,c,n)
TTIMOV    Vector move with increment (TM to TM)    FTTIMO(a,i,c,k,n)
TTMADD    Vector add (TM+TM to MD)                 FATTMD(a,i,b,j,c,k,n)


Table D-1 AP-FORTRAN Callable Math Library Routines (cont.)


ROUTINE   DESCRIPTION                              AP-FORTRAN CALLABLE NAME
------------------------------------------------------------------------------------------
TTMMUL    Vector multiply (TM*TM to MD)            FMTTMD(a,i,b,j,c,k,n)
TTMSUB    Vector subtract (TM-TM to MD)            FSTTMD(a,i,b,j,c,k,n)
TTTADD    Vector add (TM+TM to TM)                 FATTT(a,i,b,j,c,k,n)
TTTMUL    Vector multiply (TM*TM to TM)            FMTTT(a,i,b,j,c,k,n)
TTTSUB    Vector subtract (TM-TM to TM)            FSTTT(a,i,b,j,c,k,n)
VABS      Vector absolute value                    FVABS(a,i,c,k,n)
VADD      Vector add                               FVADD(a,i,b,j,c,k,n)
VALOG     Vector antilogarithm (base 10)           FVALOG(a,i,c,k,n)
VDIV      Vector divide                            FVDIV(a,i,b,j,c,k,n)
VEQV      Vector logical equivalence               FVEQV(a,i,b,j,c,k,n)
VFIX      Vector integer fix                       FVFIX(a,i,c,k,n)
VIMAG     Extract imaginaries of complex vector    FVIMAG(a,i,c,k,n)
VINT      Vector truncate to integer               FVINT(a,i,c,k,n)
VAAM      Vector add, add, and multiply            FVAAM(a,i,b,j,c,k,d,l,e,m,n)
VDBPWR    Vector conversion to dB (power)          FVDBPR(a,i,b,c,k,n)
VLIM      Vector limit                             FVLIM(a,i,b,c,d,l,n)
VLN       Vector natural logarithm                 FVLN(a,i,c,k,n)

Table D-1 AP-FORTRAN Callable Math Library Routines (cont.)


ROUTINE   DESCRIPTION                              AP-FORTRAN CALLABLE NAME
------------------------------------------------------------------------------------------
VLOG      Vector logarithm (base 10)               FVLOG(a,i,c,k,n)
VMA       Vector multiply and add                  FVMVA(a,i,b,j,c,k,d,l,n)
VMAX      Vector maximum                           FVMAX(a,i,b,j,c,k,n)
VMAXMG    Vector maximum magnitude                 FVMGAX(a,i,b,j,c,k,n)
VMIN      Vector minimum                           FVMIN(a,i,b,j,c,k,n)
VMINMG    Vector minimum magnitude                 FVMGIN(a,i,b,j,c,k,n)
VMMA      Vector multiply, multiply and add        FVMMA(a,i,b,j,c,k,d,l,e,m,n)
VMMSB     Vector multiply, multiply and subtract   FVMMSB(a,i,b,j,c,k,d,l,e,m,n)
VMOV      Vector move                              FVMOV(a,i,c,k,n)
VMSB      Vector multiply and subtract             FVMSB(a,i,b,j,c,k,d,l,n)
VMUL      Vector multiply                          FVMUL(a,i,b,j,c,k,n)
VNEG      Vector negate                            FVNEG(a,i,c,k,n)
VOR       Vector logical or                        FVOR(a,i,b,j,c,k,n)
VPK16     Vector 16 bit byte pack                  FVPK16(a,i,c,k,n)
VPK8      Vector 8 bit byte pack                   FVPK8(a,i,c,k,n)
VPOLY     Vector polynomial evaluation             FVPOLY(a,i,b,j,c,k,n,m)
VRAMP     Vector ramp                              FVRAMP(a,b,c,k,n)
VRAND     Vector random numbers                    FVRAND(a,c,k,n)
VREAL     Extract reals of complex vector          FVREAL(a,i,c,k,n)
VSADD     Vector scalar add                        FVSADD(a,i,b,c,k,n)
VSBM      Vector subtract and multiply             FVSBM(a,i,b,j,c,k,d,l,n)
VSBSBM    Vector subtract, subtract and multiply   FVSB2M(a,i,b,j,c,k,d,l,e,m,n)
VSCALE    Vector scale (power 2) and fix           FVSCLE(a,i,c,k,n,nb)
VSCSCL    Vector scan, scale (power 2) and fix     FVSNSL(a,i,c,k,n,[illegible])
VSEFLT    Vector sign extend and float             FVSEFL(a,i,c,k,n)
VSHFX     Vector shift and fix                     FVSHFX(a,i,c,k,n,ns)
VSIMPS    Vector Simpson's 1/3 rule integration    FVSIMP(a,i,c,k,n,h)
VSIN      Vector sine                              FVSIN(a,i,c,k,n)
VSMSA     Vector scalar multiply and scalar add    FVSMSA(a,i,b,c,d,l,n)
VSMUL     Vector scalar multiply                   FVSMUL(a,i,b,c,k,n)
VSQ       Vector square                            FVSQ(a,i,c,k,n)
VSQRT     Vector square root                       FVSQRT(a,i,c,k,n)
VSSQ      Vector signed square                     FVSSQ(a,i,c,k,n)
VSUB      Vector subtract                          FVSUB(a,i,b,j,c,k,n)

Table D-1 AP-FORTRAN Callable Math Library Routines (cont.)


ROUTINE   DESCRIPTION                              AP-FORTRAN CALLABLE NAME
------------------------------------------------------------------------------------------
VTRAPZ    Vector trapezoidal rule integration      FVTRAP(a,i,c,k,n,h)
VTSMUL    Vector table scalar multiply             FVTSMU(a,i,b,c,k,n)
VUPS16    Vector 16 bit signed byte unpack         FVUS16(a,i,c,k,n)
VUPS8     Vector 8 bit signed byte unpack          FVUPS8(a,i,c,k,n)
WIENER    Wiener Levinson algorithm                FWIENR(lr,r,g,f,a,isw)
ZMD       Clear all pages of main data memory      FZMD


Notice to the Reader


• Help us improve the quality and usefulness
  of this manual.


• Your comments and answers to the following
  READERS COMMENT form would be appreciated.


To mail: fold the form in three parts so 
that Floating Point Systems' 
mailing address is visible; seal. 


Thank you 


READERS COMMENT FORM 


Document Title 


Your comments and answers will help 
us improve the quality and usefulness 
of our publications. If your answers 
require qualification or additional 
explanation, please comment in the 
space provided below. 


How did you use this manual?


AS AN INTRODUCTION TO THE SUBJECT
AS AN AID FOR ADVANCED TRAINING
TO LEARN OF OPERATING PROCEDURES
TO INSTRUCT A CLASS
AS A STUDENT IN A CLASS
AS A REFERENCE MANUAL
OTHER


Did you find this material...


                        YES   NO


USEFUL?                 ( )   ( )
COMPLETE?               ( )   ( )
ACCURATE?               ( )   ( )
WELL ORGANIZED?         ( )   ( )
WELL ILLUSTRATED?       ( )   ( )
WELL INDEXED?           ( )   ( )
EASY TO READ?           ( )   ( )
EASY TO UNDERSTAND?     ( )   ( )


Please indicate below whether your
comment pertains to an addition,
deletion, change or error; and, where
applicable, please refer to specific
page numbers.


Page    Description of error or deficiency


From:

Name

Firm                    Department

Address

Telephone


First Class 
Permit No. A-737 


Portland, 
Oregon 


BUSINESS REPLY 


No postage stamp necessary if mailed in the United States 


Postage will be paid by: 


FLOATING POINT SYSTEMS, INC. 


P.O. Box 23489 
Portland, Oregon 97223 


Attention: Technical Publications 


FLOATING POINT SYSTEMS, INC.


CALL TOLL FREE 800-547-1445
P.O. Box 23489, Portland, OR 97223
(503) 641-3151, TLX: 360470 FLOATPOINT PTL


